ABSTRACT

The effects of a summer science program for high school students were evaluated using matched longitudinal data from two national samples of students. Among white students of both sexes, participation in the program appears to increase interest in majoring in science in college, in pursuing a career as a scientist, and in going on for the PhD degree. Although similar, but less pronounced, effects were observed among black male participants, no such effects were observed among black female participants.


FOR SEVERAL years now the National Science Foundation (NSF) has sponsored a student science training program (SSTP) for high-ability secondary school students. The primary purpose of this program is to stimulate the scholarly development and scientific interests of science-oriented high school students by means of direct experience with college-level instruction and research. The program also seeks to encourage the development of similar programs through other sources of support.

The purpose of this study was to determine the characteristics of students who are selected to participate in the SSTP and to assess the impact of the program on the student's achievement, career plans, and scientific interests.

NATURE OF THE PROGRAM

According to a recent NSF brochure (4), projects can be designed for either of two types of students: (1) those "from secondary schools in which science instruction is, by national standards, satisfactory or better" or (2) those "with limited educational opportunities who have demonstrated high potential for academic achievement, but in whose secondary schools science training is deficient because of inadequate facilities or instruction." Both types of projects, however, are designed for high-ability students. The emphasis in both programs of study "should be on substantial and intellectually rewarding programs of study, either through formal class and laboratory work or by means of research programs of suitable difficulty." Although selection of students is the responsibility of the institution submitting the project proposal, the Foundation ordinarily rejects proposals in which the students are drawn largely from a single high school or are enrolled in a regular college summer session course.

A given project ordinarily lasts from 5 to 11 weeks. Although a few students participate during the academic year, the large majority participate during the summer between the junior and senior years in high school. Most projects offer in-depth instruction in specific scientific subjects. A smaller number are more research-oriented, with the student assuming the role of a junior member of a research team under the supervision of a senior scientist. A few projects combine both classroom work and research experience. Instructional costs are borne by the NSF, although most students are expected to provide for their own living and travel expenses. There is, however, financial aid available for students who are unable to participate because of limited financial resources.

EVALUATION OF THE PROGRAM

A crucial consideration in any decision to perpetuate, expand, or terminate educational programs of this type is their impact on the students who participate. Unfortunately, the opportunities for empirical evaluations of such action programs are typically limited, because the design of the initial program does not provide for a research or evaluation component. In the case of the SSTP, however, a unique opportunity to carry out an empirical evaluation was afforded by the fact that data on large national samples of students were available from two sources: the national testing program of the National Merit Scholarship Corporation (NMSC) and the Cooperative Institutional Research Program of the American Council on Education (ACE). Since the National Merit Scholarship Qualifying Test (NMSQT) is administered during the junior year of high school (before most students participate in the SSTP), and since the ACE's survey is administered during the first few weeks of college (after participation in the SSTP), we can estimate the impact of the NSF Program by examining changes over time in those students who happen to be in both the NMSC and ACE samples.


Selecting the Samples 


[...] Spring of 1966, at which time the NMSC also included a brief student [...] materials, [...] questions about the student's vocational interests, educational aspirations, and career plans [...]

A total of 280,650 entering freshman students at 359 colleges and universities participated in the survey of entering freshmen conducted by the ACE [...]. In addition to the usual questionnaire information, [...] for purposes of identification: name, address, date of birth, [...]. These same items of information were also available from the questionnaire completed by students who took the NMSQT 18 months earlier. This latter sample included approximately 800,000 students.


In order to identify students in the two samples, the two files of identifying information (NMSC and ACE) were sorted in the following order: sex, date of birth, [...] name, first initial, and middle initial. The two sorted files were then matched. Any pair of records that matched exactly in terms of all matching criteria was considered to be from the same student. Although there is a finite probability of a few mismatches under these conditions (students with very common last and first names who were born in very populous states, for example), the number is probably very small. Judging from the small proportion of identical consecutive records in the National Merit file (less than 1 in 5,000), the number of actual mismatches among those assumed to be matches is probably far less than 1 percent. Students who could be matched exactly in terms of some, but not all, of the criteria were listed separately for visual inspection. Several additional "matches" were identified in this manner (for example, students who reported a middle initial in one of the testings but not in the other). In all, a total of 102,295 matches were identified. Although substantial in size, this sample is somewhat below the proportion of freshmen from the ACE sample who would be expected to have taken the National Merit Test (36.4 percent matches as compared to an expected overlap of approximately 55 percent). However, such a loss is to be expected, considering the stringent criteria employed for matching and the relatively high probability that students will not report accurately at least one of the items of identifying information on at least one of the testings. The bias resulting from these stringent matching procedures would, of course, tend to exclude students who are reluctant to report complete identifying information as well as students who report such information inaccurately or incompletely.
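The matching procedure described above can be illustrated with a minimal sketch. It is not the authors' program: the field names (`sex`, `birth_date`, `last_name`, `first_initial`, `middle_initial`) are hypothetical, and a dictionary lookup stands in for the sort-and-merge pass over the two files. Records that agree on every criterion are treated as exact matches; records that differ on exactly one criterion are set aside for visual inspection.

```python
from collections import defaultdict

# Hypothetical field names; the source lists matching criteria, not a record layout.
KEY_FIELDS = ("sex", "birth_date", "last_name", "first_initial", "middle_initial")


def normalize(value):
    """Strip and upper-case a field so trivial formatting differences do not block a match."""
    return (value or "").strip().upper()


def make_key(record, fields=KEY_FIELDS):
    """Build the tuple of identifying items used for matching."""
    return tuple(normalize(record.get(f)) for f in fields)


def match_files(nmsc_records, ace_records):
    """Return (exact, review).

    exact  -- pairs that agree on every matching criterion
    review -- pairs that differ on exactly one criterion (for visual inspection)
    """
    index = defaultdict(list)
    for rec in nmsc_records:
        index[make_key(rec)].append(rec)

    exact, review = [], []
    for ace in ace_records:
        key = make_key(ace)
        if key in index:
            exact.extend((nmsc, ace) for nmsc in index[key])
            continue
        # Brute-force near-miss search; acceptable for a sketch, too slow for large files.
        for nmsc in nmsc_records:
            if sum(a != b for a, b in zip(key, make_key(nmsc))) == 1:
                review.append((nmsc, ace))
    return exact, review


if __name__ == "__main__":
    nmsc = [
        {"id": "n1", "sex": "F", "birth_date": "1949-03-02", "last_name": "Lee",
         "first_initial": "A", "middle_initial": "M"},
        {"id": "n2", "sex": "M", "birth_date": "1949-07-19", "last_name": "Ortiz",
         "first_initial": "J", "middle_initial": ""},
    ]
    ace = [
        {"id": "a1", "sex": "F", "birth_date": "1949-03-02", "last_name": "lee ",
         "first_initial": "a", "middle_initial": "M"},
        {"id": "a2", "sex": "M", "birth_date": "1949-07-19", "last_name": "Ortiz",
         "first_initial": "J", "middle_initial": "R"},
    ]
    exact, review = match_files(nmsc, ace)
    print([(n["id"], a["id"]) for n, a in exact])   # [('n1', 'a1')]
    print([(n["id"], a["id"]) for n, a in review])  # [('n2', 'a2')]
```

The second pair illustrates the case mentioned in the text: a middle initial reported at one testing but not at the other.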


SSTP participants were identified by means of an item included on the ACE freshman questionnaire. A total of 2,018 students (2 percent of the matched sample) indicated that they had been participants in an NSF Summer Program. In addition to these SSTP participants, two "control" groups were selected from the sample of matched Ss: all black students (N = 3,003) and every fifteenth nonblack student (N = 6,484)². The latter subsample, rather than all of the remaining nonparticipants, was selected in order to reduce computing costs.
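As an arithmetic check on the size of this subsample, the sketch below takes every fifteenth record from a list standing in for the nonblack nonparticipants. It assumes that the 3,003 black students in the control group are all nonparticipants, so that 100,277 - 3,003 = 97,274 nonblack nonparticipants remain (the 100,277 figure is the nonparticipant N in Table 1). The function name is illustrative only.

```python
def every_kth(records, k):
    """Systematic sample: keep the k-th, 2k-th, 3k-th, ... record of a list."""
    if k < 1:
        raise ValueError("k must be a positive integer")
    return records[k - 1::k]


# Stand-in for the file of nonblack nonparticipants (assumed size, see lead-in).
nonblack_nonparticipants = list(range(100_277 - 3_003))
subsample = every_kth(nonblack_nonparticipants, 15)
print(len(subsample))  # 6484
```

Under that assumption the count agrees with the N of 6,484 reported for the nonblack control group.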


Characteristics of the Samples 

Table 1 shows the sexual and racial composition of the SSTP participants and nonparticipants among the matched Ss and also among all entering college freshmen of 1967 (all data [...] interests in [...] (see Table 1) [...] the program participants could be equalized with [...] appreciable loss in either the level of talent or [...] students.

The data in Table 1 reveal an interesting and perhaps unexpected finding concerning the racial composition of the NSF participant group: the proportion of blacks is more than twice as great as it is among the nonparticipants, and also substantially higher than it is among all entering college freshmen. There also appears to be a slight overrepresentation of Oriental students among participants. Considering the fact that blacks typically score below whites on the types of selection criteria normally used in such programs (grades and test scores, for example) (2), the high proportion of black students in the participant group suggests that there may have been a conscious effort to recruit such students into the program. Such recruiting would, of course, be consistent with the second SSTP selection criterion, i.e., to make the program available to students whose opportunities for science training have been limited.

ASTIN

TABLE 1

SEX AND RACE OF NSF SUMMER PROGRAM PARTICIPANTS AND NONPARTICIPANTS

                      Entering Freshmen Who Took the NMSQT
                 NSF Participants      Nonparticipants       All Freshmen Entering
                   (N = 2,018)          (N = 100,277)        College in Fall 1967*
                Male  Female  Total   Male  Female  Total   Male  Female  Total
Percent male                   69.7                  53.6                  55.6
White           91.6   81.5    88.4   94.5   92.2    93.4   90.1   89.6    89.9
Black            5.1   12.7     7.5    2.2    3.9     3.0    3.9    4.8     4.3
Oriental         1.7    3.8     2.3     .9     .8      .9     .9   [...]     .8
Indian          [...]

* From Panos, Astin, and Creager (5).

Note: A fifth racial category ("other") is not shown here.

Table 1 also affords an opportunity for us to compare the large sample of matched Ss with all freshmen entering college in the fall of 1967. While the sex ratio among the matched sample is very close to that of all entering freshmen (53.6 percent versus 55.6 percent), the proportion of whites appears to be slightly higher and the proportion of blacks slightly lower than is found among all entering freshmen. This difference in racial composition could reflect a bias in the schools and students who participate in the National Merit Scholarship Program, or it might also reflect racial differences in the accuracy of reporting the necessary identifying information.

In view of the sexual and racial biases in the sample of NSF Program participants, all of the analyses to be reported subsequently have been performed separately by sex and race. Analyses involving nonblack nonparticipants, however, shall be based on the subsample of 6,484 students rather than on the entire sample of nonblack nonparticipants. Wherever the analyses involve data from the ACE freshman questionnaire, we shall also report national normative data for all entering freshmen of 1967 (5).


Note to Table 1: Percentages in each column sum to less than 100 across the four racial categories because a fifth category ("other") is not shown.


Table 2 shows selected background characteristics of the four participant and four nonparticipant groups, as well as national normative data for all entering freshmen of 1967. With respect to age, all groups of NSF Program participants (boys and girls, both black and white) are younger than nonparticipants. These age differences are especially pronounced among the black males. Thus, although blacks tend to be older than whites among the nonparticipating boys, the reverse is true among the participants: the blacks are somewhat younger than the whites. A similar trend is observable among the girls, where the proportion of students below age 18 is considerably higher among the black than among the white participants. However, it should be noted that the black girls are more variable with respect to age than are the white girls: proportionately more blacks are also above age 18 among the girl participants.


The data in Table 2 also indicate that, without exception, participants come from more highly educated and more affluent families than do nonparticipants. The differences in parental education are somewhat larger in size than the differences in family income. As might be expected, the education and income levels of the parents of black students are consistently below those of the parents of white students. The differences in family income are particularly striking, with the proportions of families with incomes below $6,000 three to four times greater among the black than among the white students.


An interesting anomaly in the data concerns the relative levels of education of the mothers and fathers of the students. Fathers of the white students are consistently more likely than are their mothers to have a college degree. Among the black students, however, participants and nonparticipants alike, the mothers are slightly more likely than are fathers to have a college education. A similar trend has been observed by Bayer and Boruch (2) in a recent study of black college freshmen.

TABLE 2

BACKGROUND CHARACTERISTICS OF NSF SUMMER PROGRAM PARTICIPANTS AND NONPARTICIPANTS BY RACE AND SEX

(P = participants; N = nonparticipants; All = all freshmen entering college in fall 1967*)

                                       Boys                      Girls
                                 White       Black         White       Black
Background Characteristic        P     N     P     N       P     N     P     N     All
Percent 17 or younger          13.8   6.9  22.1  10.8    13.4   8.8  28.6  15.6    4.8
Percent 19 or older             7.5  11.8   4.5  17.4     4.1   6.6   5.2   9.4   18.5
Parent has a college degree:
  Mothers                      36.2  21.6  35.8  18.9    33.5  24.2  31.6  19.2   17.8
  Fathers                      49.6  34.0  30.3  18.0    41.6  37.8  25.0  17.2   26.4
Family income below $6,000      9.0  10.9  371.9 [sic]  42.5   9.8  12.6  39.7  44.3  13.9
Percent [...] or higher        65.8  55.8  34.4  25.6    61.1  56.1  29.4  23.0   59.2
Religion of parents:
  Percent Protestant           52.4  51.7  15.4  62.6    55.8  53.3  73.7  66.7   [...]
  Roman Catholic               19.7  [...]
  Jewish                       20.2  [...]
  "None"                        4.3  [...]
Percent living in:
  New England                   4.8   1.5   4.0   3.0     6.8   8.3    .0   1.8    5.7
  Middle Atlantic              30.0  25.1  16.4  26.9  25.0  25.4  [...]
  North Central                30.4  87.7  7.8  [...]
  Northwest                     6.1  5.7  .0  1.4  [...]
  South                        20.3  17.3  [...]
  West and Southwest            8.4  6.6  1.5  3.8  10.0  [...]

* From Panos, Astin, and Creager (5).

TABLE 3

NMSQT SELECTION SCORES OF NSF SUMMER PROGRAM PARTICIPANTS AND NONPARTICIPANTS BY SEX AND RACE

                            Percentage Scoring Above 105      Percentage Scoring Above 130
Group                       Participants  Nonparticipants     Participants  Nonparticipants
All Freshmen Entering
College, Fall 1967*                            50.0                               9.1
Boys
  White                        93.0            70.1              65.1            22.1
  Black                        41.8            25.2              10.3             3.4
Girls
  White                        89.9            65.1              57.7            20.4
  Black                        39.0            19.3               5.2             1.9

* Estimated national norms from Astin (1). Data for all other "nonparticipant" groups are unweighted tabulations from those entering freshmen of 1967 (5) for whom NMSQT scores could be obtained.


Table 2 shows some interesting biases in the religion of the students' parents. Students from Roman Catholic parents are consistently underrepresented in all participant groups. Among the white students, parents with no religious preference and (particularly among the men) parents who are Jewish are overrepresented in the participant group, whereas Protestants are approximately equally represented in the participant and nonparticipant groups. Among the black students, Protestants are somewhat overrepresented among the participants.


The final category of data in Table 2, the geographic region of the students' home towns, also shows certain differences between participants and nonparticipants as well as between the different races. Participants, for example, appear to be overrepresented in the Middle Atlantic and Southern states, and somewhat underrepresented in the New England states. These biases are particularly evident among the black students, where the participants are much more likely than the nonparticipants to come from the Southern states.


A final item from Table 2 should be noted. In comparison with all freshmen entering college in 1967, our sample of matched Ss appears to be somewhat younger and to have somewhat more highly educated parents than the typical college freshman. Although the distributions of family incomes and home states are quite similar in the matched sample and in the national sample of college freshmen, the proportion of Jewish parents among the matched sample is substantially higher than the proportion in the entering-college population. It seems likely that these and other differences in race, age, and parental educational level are in part the result of biases in the secondary schools and students who participate in the National Merit Program.


Selection scores of the various groups on the NMSQT are shown in Table 3. Rather than simply computing means for each of the groups, we have selected two cutting points, 105 and 130, to show the percentage of students in each group scoring above each of these points. The score of 105 is approximately the median (fiftieth percentile) for all college freshmen, and a score of 130 is near the cutting score used by the NMSC to award letters of commendation and certificates of merit to the participants. The data show clearly that, within each of the sexual and racial groups, NSF Summer Program participants score substantially higher than do nonparticipants. In fact, when all four groups of participants are combined, nearly 60 percent of the participants obtain selection scores above 130. These findings show clearly that the Program is fulfilling its objectives in terms of selecting students of exceptional ability.


The data in Table 3 also reveal small differences favoring all four groups of boys, and substantial differences favoring all four groups of white students. This latter result reflects the well-known racial differences in performance on tests of academic achievement. Nevertheless, it should be stressed that, among black students of both sexes, SSTP participants score substantially higher than do nonparticipants.

Some of the high school achievements of the various groups of participants and nonparticipants are shown in Table 4. Once again, we see here the superior academic and scientific accomplishments of all four groups of participants. Participants, for example, are two to three times more likely than are nonparticipants to make A averages in high school; very few participants, on the other hand, make less than a B- average. Membership in scholastic honor societies follows the same pattern, with approximately twice as many participants as nonparticipants being elected to such societies. The percentage of students receiving National Merit recognition follows the pattern of test scores reported in Table 3. One unexpected finding here is the relatively high percentages of black students who report receiving National Merit recognition; the percentage is considerably higher than would be expected from their test scores as shown in Table 3. One possible explanation here is that the NMSC also administers the National Achievement Scholarship Program for outstanding Negro students, in which the NMSQT is not used as a screening device.


Participants and nonparticipants differ markedly in terms of winning awards in regional or state science contests. The rate of such awards in the participant groups is three to five times greater than it is in corresponding nonparticipant groups.

TABLE 4

HIGH SCHOOL ACHIEVEMENTS OF NSF SUMMER PROGRAM PARTICIPANTS AND NONPARTICIPANTS, BY RACE AND SEX

(P = participants; N = nonparticipants; All = all freshmen entering college in fall 1967*)

                                            Boys                      Girls
                                      White       Black         White       Black
High School Achievement               P     N     P     N       P     N     P     N     All
Percent with average grades of:
  A+, A-, or A                      69.1  22.1  30.9   9.3    73.8  32.7  38.1  14.6   14.4
  Less than B-                       1.9  15.4  14.7  32.3     1.2   1.3  10.5  19.9   30.5
Percent elected to scholastic
  honor society                     74.3  38.7  66.2  31.8    79.0  53.7  45.8  21.1   [...]
Percent receiving National
  Merit recognition                 51.8  16.9  44.1  14.2    41.2  15.4  28.6  15.3    1.7
Percent winning award in regional
  or state science contest          17.3   3.1  23.5   7.4    12.9   2.5  20.8   6.8    2.5

* From Panos, Astin, and Creager (5).


Table 5 shows the educational and career plans of the various groups as reported in the questionnaire completed when they took the NMSQT during the eleventh grade. These data highlight the very strong scientific interests of the participant groups, as well as their high aspirations for graduate study. It is perhaps surprising that, among the white male students, the percentage of nonparticipants planning careers in engineering (20.1) is not appreciably lower than the percentage of participants planning such careers (21.7). Among the other three groups, however, interest in engineering is somewhat stronger among participants than among nonparticipants. It is also [...] (medical nursing, pharmacy, and so forth) are actually [...] stronger in biological science, medicine, and "other medical [...]


In summary, our data on NSF participants indicate that they are an exceptionally able group academically. In addition, their interests in science and aspirations for graduate study are considerably stronger than those of other high school students. They also tend to be younger, and their parents are less likely than other students' parents to be Roman Catholic and more likely to be Jewish or to have no formal religion. Although the parents of SSTP students also tend to be more highly educated and more affluent than typical parents, the group as a whole contains a relatively large proportion of black students. This latter finding is probably attributable to the fact that some of the traineeships are intended for students whose educational opportunities have been limited.


Evaluation Criteria 

Criteria for assessing the impact of participation in the SSTP were derived from the ACE freshman questionnaire. Two considerations were used: (a) relevance of the questionnaire item to the objectives of the program; and (b) availability of appropriate "pretest" data from the NMSC questionnaire. (The latter requirement was considered essential to enable us to control adequately for initial pre-program differences between participants and nonparticipants.) Following these guidelines, five evaluative criteria were selected:

TABLE 5

ELEVENTH GRADE EDUCATIONAL AND CAREER PLANS OF NSF SUMMER PROGRAM PARTICIPANTS AND NONPARTICIPANTS BY RACE AND SEX

(P = participants; N = nonparticipants)

                                            Boys                      Girls
                                      White       Black         White       Black
Plans                                 P     N     P     N       P     N     P     N
Percent planning career in:
  Engineering                       21.7  20.1  26.5  17.4     2.8    .2   1.3    .8
  Physical science                  30.7   9.3  16.2  10.9    22.1   3.7  18.2   6.5
  Biological science                 7.8   2.6   8.8   3.7    14.0   2.1   7.8   2.3
  Medicine (MD)                     10.2   6.3  14.1   6.8     9.2   2.2  11.7   4.0
  Other medical field                2.2   5.1   1.5   4.7     7.9  12.0   7.8  12.2
Percent liking scientific research
  "some" or "much"                  92.5  66.8  90.1  67.1    91.6  57.3  80.3  59.1
Percent planning PhD degree         55.7  21.5  52.9  30.1    29.6   7.1  32.5  21.5


1. Intention to major in a scientific field. This criterion was derived from sixty-six college majors listed in the freshman questionnaire. All students who checked either engineering or one of the physical, biological, social, or behavioral sciences as their "probable field of study" were assigned a score of 1; all other students received a score of 0.

2. Intention to pursue a career in science. This dependent variable was derived from forty-four possible careers listed in the freshman questionnaire. A student was considered to be pursuing a career in science (score 1) (a) if he checked "scientific researcher" as his preferred career, or (b) if he was intending to major in science (1, above) and checked either secondary school teacher or college teacher as his "probable future career." All other students (including those majoring in science and not planning a career in either teaching or research) received a score of 0.

3. Intention to obtain the PhD degree (scored dichotomously).

4. Placed (first, second, or third) in a state or regional science contest (scored dichotomously).

5. Interest in "making a theoretical contribution to science" (scored on a 4-point (4-1) scale: Essential, Very Important, Somewhat Important, and Of Little or No Importance).
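The scoring rules for the first two criteria can be sketched as follows. The category labels are hypothetical stand-ins: the questionnaire's sixty-six majors and forty-four careers are not reproduced in the text, so each response is assumed to have been collapsed into a broad group label beforehand.

```python
# Hypothetical group labels; the actual questionnaire codes are not given in the source.
SCIENCE_MAJOR_GROUPS = {
    "engineering",
    "physical science",
    "biological science",
    "social science",
    "behavioral science",
}
TEACHING_CAREERS = {"secondary school teacher", "college teacher"}


def science_major_score(major_group):
    """Criterion 1: score 1 if the probable field of study is engineering or a science."""
    return 1 if major_group in SCIENCE_MAJOR_GROUPS else 0


def science_career_score(career, major_group):
    """Criterion 2: score 1 for a scientific researcher, or for a science major
    who plans to teach at the secondary school or college level; otherwise 0."""
    if career == "scientific researcher":
        return 1
    if science_major_score(major_group) == 1 and career in TEACHING_CAREERS:
        return 1
    return 0


print(science_career_score("college teacher", "physical science"))  # 1
print(science_career_score("physician", "biological science"))      # 0
```

The second call shows the case singled out in the text: a science major who plans neither teaching nor research receives a score of 0 on the career criterion.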


Pretest ("Control") Variables


Pretest or control variables included the selection score and the five subtests from the NMSQT, in addition to the following eighty-five measures from the student questionnaires: high school grades, age, highest degree sought (four dichotomies), father's education, mother's education, parental income, parental religion (four dichotomies), region of residence (five dichotomies), first and second choices of probable major fields in college (34 dichotomies), probable career choice (15 dichotomies), and degree of interest in eighteen specific occupations (scores on eighteen 5-point scales). With the exception of age, residence, and parental religion and income, which were taken from the ACE questionnaire, all items were based on the students' responses to the National Merit questionnaire administered when they took the NMSQT.


Statistical Analyses 


The principal technique used to assess the impact of SSTP participation was multiple stepwise linear regression analysis. For these analyses a 5 percent sample of the 102,295 matched Ss (N = 5,114) was selected. Five separate stepwise analyses were performed, one for each evaluation criterion. The eighty-five pretest measures served as independent variables in each stepwise analysis. Each of the analyses was continued until no additional pretest variable was capable of producing a statistically significant (p = .05) reduction in the residual sum of squares.

The object of these regression analyses was to identify all pretest variables which affected the student's scores on each evaluation criterion, regardless of participation in the SSTP. Once the appropriate weights for each of the pretest variables had been identified, they could be used to equate the various groups of participants and nonparticipants statistically in terms of their characteristics as high school juniors.
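The stopping rule described above can be illustrated with a small forward-selection sketch in plain Python. It is an illustration under stated assumptions, not the program used in the study: variables are only added (never removed), the entry test is an F-to-enter threshold (set here to 4.0 as an assumed stand-in for the p = .05 cutoff), and the helper names `solve`, `fit`, and `forward_stepwise` are invented for this example.

```python
def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting.
    Raises ZeroDivisionError if the predictors are perfectly collinear."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def fit(columns, y):
    """Least-squares fit with an intercept; returns (weights, residual sum of squares)."""
    n = len(y)
    design = [[1.0] * n] + [list(col) for col in columns]
    k = len(design)
    xtx = [[sum(design[i][t] * design[j][t] for t in range(n)) for j in range(k)]
           for i in range(k)]
    xty = [sum(design[i][t] * y[t] for t in range(n)) for i in range(k)]
    beta = solve(xtx, xty)
    rss = sum((y[t] - sum(beta[i] * design[i][t] for i in range(k))) ** 2
              for t in range(n))
    return beta, rss


def forward_stepwise(predictors, y, f_to_enter=4.0):
    """Add, one at a time, the predictor giving the largest F for the reduction
    in residual sum of squares; stop when no candidate exceeds f_to_enter."""
    selected = []
    _, rss = fit([], y)
    while True:
        best = None
        for name in predictors:
            if name in selected:
                continue
            cols = [predictors[s] for s in selected] + [predictors[name]]
            df = len(y) - len(cols) - 1
            if df <= 0:
                continue
            _, new_rss = fit(cols, y)
            f = float("inf") if new_rss <= 1e-12 else (rss - new_rss) / (new_rss / df)
            if f > f_to_enter and (best is None or f > best[1]):
                best = (name, f, new_rss)
        if best is None:
            return selected
        selected.append(best[0])
        rss = best[2]


# Made-up illustrative data: y depends on x1; x2 carries no additional information.
x1 = [float(i) for i in range(12)]
x2 = [1.0, -1.0] * 6
noise = [0.5, -0.5, -0.5, 0.5] * 3
y = [3.0 + 2.0 * a + e for a, e in zip(x1, noise)]
print(forward_stepwise({"x1": x1, "x2": x2}, y))  # ['x1']
```

In the study itself the candidate set would be the eighty-five pretest measures and the outcome one of the five evaluation criteria; the toy data here exist only to show the entry and stopping logic.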


RESULTS AND DISCUSSION 


Table 6 summarizes the results of the five stepwise regression analyses. The multiple correlation coefficients indicate that the five evaluation criteria can be predicted over the eighteen-month interval with moderate accuracy. In general, pretest variables that received the largest weights were those whose content most resembled the content of the evaluation criteria being predicted. Thus, initial major field plans received the largest weights in predicting freshman choice of a major in college. Similarly, degree plans as expressed in the eleventh grade received the largest weights in predicting degree plans at the time of entrance to college.

TABLE 6

SUMMARY OF STEPWISE REGRESSION ANALYSES (N = 5,114)

                                                    Number of Pretest
                                                    Variables Entering   Pretest Variables Receiving
Evaluation Criterion                          R     Equation*            Largest Weights
Intention to major in a scientific field    .567        32               Initial major and career plans, interest scales, NMSQT math
Intention to pursue a career in science     .630        32               Initial major and career plans, interest scales
Intention to obtain the PhD                 .463        27               Initial degree plans, NMSQT selection, sex (male)
Placed in state or regional science contest .583        20               High school grades, NMSQT selection, sex (male)
Interest in "making a theoretical
  contribution to science"                 [...]51      28               Interest in research, initial degree [...], [...] (male)

* Cutoff point was p = .05 (F = 4.00).

These regression [...] estimate the effects of SSTP [...] assuming no effect of the Program [...] expected scores are then [...] actual performance to [...] from expectation. ([...] regression analysis, these differences [...] mean residuals from regression.)

TABLE 7

EFFECTS OF PARTICIPATION IN NSF STUDENT SCIENCE TRAINING PROGRAM ON PLANS TO MAJOR IN A SCIENTIFIC FIELD IN COLLEGE

                              Percent of Students Planning to Major in a Science Field
                              Expected (Based on    Actual (From        Difference
Student Group          N      Spring 1966 Data)     Fall 1967 Data)     (Actual - Expected)
Participants
  White Boys         1,339         63.7                 70.6                +6.9
  White Girls          534         49.9                 61.6               +11.7
  Black Boys            68         67.5                 64.3                -3.2
  Black Girls           77         50.5                 45.5                -5.0
Nonparticipants
  White Boys         3,501         41.0                 41.2                +0.2
  White Girls        2,983         21.8                 28.8                [...]
  Black Boys         1,179         50.5                 40.8                -9.7
  Black Girls        1,824         36.8                 32.6                -4.2

Table 7 shows the expected, actual, and difference percentages for the first evaluation criterion (intention to major in a scientific field in college), separately for each of the eight groups. The data indicate that the percentages of white SSTP participants who plan to major in science when they enter college exceed expectation for both sexes. These differences between the expected and actual percentages, when compared with the small differences obtained in both groups of white nonparticipants, indicate that SSTP participation increases the white student's chances of selecting a science major when he enters college.
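The expected, actual, and difference columns can be read as group means of regression predictions, observed scores, and residuals. The sketch below shows that computation for one group; it assumes each student has an expected score (the regression prediction from eleventh grade data, treated here as a probability between 0 and 1) and an actual dichotomous score. The function name and the four-student example are illustrative and are not taken from the study's data.

```python
def group_effect(expected_scores, actual_scores):
    """Return (expected %, actual %, difference) for one group of students.

    The difference equals 100 times the mean residual (actual minus expected).
    """
    if len(expected_scores) != len(actual_scores) or not actual_scores:
        raise ValueError("score lists must be non-empty and of equal length")
    n = len(actual_scores)
    expected_pct = 100.0 * sum(expected_scores) / n
    actual_pct = 100.0 * sum(actual_scores) / n
    return expected_pct, actual_pct, actual_pct - expected_pct


# Made-up scores for four hypothetical students in one group.
expected = [0.5, 0.75, 0.25, 0.5]   # predicted from pretest variables
actual = [1, 1, 0, 1]               # 1 = plans a science major as a freshman
print(group_effect(expected, actual))  # (50.0, 75.0, 25.0)
```

A positive difference for participants, set against a difference near zero for the corresponding nonparticipants, is the pattern the text interprets as a program effect.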


Results with black students, however, are not so clear-cut. All four groups of black students (participants and nonparticipants alike) show less interest in majoring in science when they enter college than would have been expected from their eleventh grade data. Apparently, there is an interaction between race and changes in science interest over the 18-month interval; that is, the science interests of black students show a somewhat greater decline than those of white students over the 18-month interval. With respect to the effects of SSTP participation, there is some indication that it retards this decline in science interests among black males (-3.2 percent among participants versus -9.7 percent among black male nonparticipants), but there is no indication that it has any such impact among the black women.

TABLE 8

EFFECTS OF PARTICIPATION IN NSF STUDENT SCIENCE TRAINING PROGRAM ON PLANS TO PURSUE A CAREER IN SCIENCE

                              Percent of Students Planning Careers as Scientists
                              Expected (Based on    Actual (From        Difference
Student Group          N      Spring 1966 Data)     Fall 1967 Data)     (Actual - Expected)
Participants
  White Boys         1,339         41.0                 49.7                +8.7
  White Girls          534         20.3                 27.2                +6.9
  Black Boys            68         40.1                 41.1                +1.0
  Black Girls           77         16.6                 13.0                -3.6
Nonparticipants
  White Boys         3,501         26.5                 25.7                -0.8
  White Girls        2,983          2.6                  2.7                +0.1
  Black Boys         1,179         27.7                 19.6                -8.1
  Black Girls        1,824          6.6                  3.5                -3.1

TABLE 9

EFFECTS OF PARTICIPATION IN NSF STUDENT SCIENCE TRAINING PROGRAM ON PLANS TO OBTAIN THE PHD DEGREE

                              Percent of Students Planning PhD Degrees
                              Expected (Based on    Actual (From        Difference
Student Group          N      Spring 1966 Data)     Fall 1967 Data)     (Actual - Expected)
Participants
  White Boys         1,339         41.8                 55.5               +13.7
  White Girls          534         24.8                 29.5                +4.7
  Black Boys            68         46.7                 52.9                +6.2
  Black Girls           77         28.6                 32.5                +3.9
Nonparticipants
  White Boys         3,501         21.5                 21.5                  .0
  White Girls        2,983          7.8                  7.8                  .0
  Black Boys         1,179         31.7                 30.0                -1.7
  Black Girls        1,824         19.9                 21.5                +1.6


The effects of SSTP participation on plans to
pursue a career in science are shown in Table 8.
Here the pattern of effects is very similar to what we
saw in Table 7. Among white students of both sexes,
interest in pursuing a career in science appears to
be enhanced by SSTP participation. White
nonparticipants show the expected difference scores of
near zero, but black nonparticipants show a greater-
than-expected decline in science interests during the
18-month interval. SSTP participation appears to
impede this decline among the black male students, but
not among the black females.
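
The difference scores in Table 8 are the actual percentage minus the expected percentage. The short Python sketch below recomputes them from the tabled values. The `net_gap` helper, which contrasts participants with nonparticipants of the same group, is an illustrative assumption and not a statistic reported in the study.

```python
# (expected %, actual %) of students planning a science career,
# transcribed from Table 8.
TABLE_8 = {
    "participants": {
        "white boys": (41.0, 49.7),
        "white girls": (20.3, 27.2),
        "black boys": (40.1, 41.1),
        "black girls": (16.6, 13.0),
    },
    "nonparticipants": {
        "white boys": (26.5, 25.7),
        "white girls": (2.6, 2.7),
        "black boys": (27.7, 19.6),
        "black girls": (6.6, 3.5),
    },
}


def difference(status, group):
    """Actual minus expected percentage, rounded to one decimal."""
    expected, actual = TABLE_8[status][group]
    return round(actual - expected, 1)


def net_gap(group):
    """Hypothetical contrast: participant difference minus nonparticipant difference."""
    return round(
        difference("participants", group) - difference("nonparticipants", group), 1
    )
```

Under these assumptions, black male participants differ from expectation by +1.0 points while black male nonparticipants differ by -8.1, which is the sense in which participation "impedes" the decline.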


Table 9 shows the effects of SSTP participation
on plans to obtain a PhD degree. The greatest
difference between expected and actual percentages
occurs among the white male participants, where the
actual score is nearly 14 percentage points above the
expected score. This finding indicates that SSTP
participation has a pronounced positive effect on the
plans of the white male high school student to go on
to the PhD degree.


Effects of SSTP participation on PhD aspirations
among the other three groups of participants are not
as marked, although the trends are in the positive
direction. Once again, there appears to be very
little effect of participation among the black female
students.


Table 10 shows the effects of SSTP participation
on the student's expressed interest in "making
a theoretical contribution to science." The difference
scores of SSTP participants and nonparticipants
indicate that the Program has a positive effect
among white students, but only a borderline effect
(not statistically significant) among black students.
These findings, however, should probably be
interpreted with more caution than the findings with the
three previous criteria (Tables 7-9), since the
National Merit questionnaire did not actually include
a "pretest" on this particular item. Thus, the
possibility that we have not adequately controlled
relevant pretest differences between participants and
nonparticipants is greater on this outcome than on
the three previous ones.


It should be pointed out that the actual mean
scores of SSTP participants as shown in Table 10


TABLE 10

EFFECTS OF PARTICIPATION IN NSF STUDENT
SCIENCE TRAINING PROGRAM ON STUDENT
INTEREST IN "MAKING A THEORETICAL
CONTRIBUTION TO SCIENCE"

                            Mean Score on Item *

                       Expected       Actual       Difference
                       (Based on      (From        (Actual -
Student                Spring 1966    Fall 1967    Expected)
Group            N     data)          data)

Participants
  White Boys   1,339     2.31           2.50         +0.29
  White Girls    534     1.91           2.15         +0.24
  Black Boys      68     2.42           2.51         +0.09
  Black Girls     77     1.91           1.95         +0.04
Nonparticipants
  White Boys   3,501     1.83           1.80         -0.03
  White Girls  2,983     1.34           1.35         +0.01
  Black Boys   1,179     2.03           1.90         -0.13
  Black Girls  1,824     1.60           1.54         -0.06

* Scored on a 4-point scale: Essential (4); Very
Important (3); Somewhat Important (2); [...]
or No Importance (1).


TABLE 11

EFFECTS OF PARTICIPATION IN NSF STUDENT
SCIENCE TRAINING PROGRAM ON WINNING AN
AWARD IN A REGIONAL OR STATE SCIENCE
CONTEST

                       Percent Winning an Award in
                       Regional or State Science Contest

                       Expected       Actual       Difference
                       (Based on      (From        (Actual -
Student                Spring 1966    Fall 1967    Expected)
Group            N     data)          data)

Participants
  White Boys   1,339      6.9           17.3         +10.4
  White Girls    534      6.3           12.9          +6.6
  Black Boys      68     10.5           23.5         +13.0
  Black Girls     77      9.2           20.8         +11.6
Nonparticipants
  White Boys   3,501      3.6            3.1          -0.5
  White Girls  2,983      2.6            2.5          -0.1
  Black Boys   1,179      7.4            7.4           0.0
  Black Girls  1,824      6.8            6.9          +0.1

fall between 2 ("of some importance") and 3 ("very
important"). Considering the high proportion of
science majors among the SSTP participants (about
two-thirds; see Table 7), one might have expected
higher scores on this scale. However, in view of the
fact that less than half of these SSTP students actually
planned a career in science (see Table 8), it is
perhaps to be expected that many of them will not
attach much importance to theoretical work.


[...] or state science contest, are shown in Table 11. This
is the only one of the five criteria where the Program
[...] very near zero. One note of caution, however, should
be added in evaluating these findings: there is some
possibility that SSTP participation is for some
students the result, rather than a cause, of having
received an award in a science contest. Some students,
for example, may have had their science projects
well underway by the time they were considered for
participation in SSTP. Such projects would, in turn,
make these students more visible to the persons
responsible for selecting SSTP participants.

ALTERNATIVE ANALYSIS 


The evidence that SSTP participation increases
the student's interest in majoring in a scientific
field (Table 7) raises a further question: Does
the Program operate to affect interest in all
science fields, or only in certain ones? In order to
explore this question, a somewhat different method
of analysis was used. Instead of performing
regression analyses separately for each of the
specific science fields, we decided to perform
a matching study. While matching as a quasi-
experimental technique is generally inferior to
regression analysis (3), we chose matching
because it permitted us to obtain for each SSTP
participant a matched nonparticipant whose eleventh-
grade choices of fields of study and careers were
exactly the same.


The procedures for obtaining these matches
were as follows. The data files for SSTP
participants (n = 2,018) and nonparticipants
(n = 100,271) were sorted separately in the following
order: sex, initial career choice, initial major field
choice, initial degree plans, high school grades,
NMSQT scores, and interest scale scores. For
each SSTP participant, a matched control S was
selected from the file of nonparticipants. In
those few cases where an exact match could not be
found, the closest match was used. Matching
criteria were relaxed in the reverse of the order shown
above (that is, initial interest scale scores were
relaxed first).


TABLE 12

PERCENTAGES OF STUDENTS CHOOSING
VARIOUS FIELDS OF STUDY BEFORE AND AFTER
PARTICIPATING IN NSF SUMMER PROGRAM

                       NSF Participants      Matched Controls *
Major Field            Eleventh   Enter      Eleventh   Enter
Choice                 Grade      College    Grade      College

Physical Sciences        34.2      33.1        33.5      26.7
Engineering              16.3      17.2        16.6      16.6
Biological Sciences      13.0      12.8        12.1       9.7
Premedical                9.7       9.4         9.9      10.2
Other Medical             3.8       3.8         3.5       3.4
Social Science            1.9       5.0         1.7       5.2
Undecided                 7.5       1.5         7.8       1.8
Other non-Science        13.6      17.2        14.9      26.4

* Controls have been matched one-to-one on the basis
of sex, initial career choice, initial major field choice,
initial degree plans, high school grades, aptitude test
scores, and initial interest scale scores.
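
The matching procedure described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' program: the field names in `CRITERIA` are hypothetical labels for the seven sorting variables, and where the text says "the closest match was used," the sketch simply drops the lowest-priority criteria one at a time and requires exact agreement on the rest.

```python
# Matching criteria in the priority order given in the text
# (field names are hypothetical labels, not taken from the study files).
CRITERIA = ["sex", "career", "major", "degree", "grades", "nmsqt", "interest"]


def find_match(participant, pool):
    """Return one control from `pool` for `participant`, or None.

    All seven criteria are tried first; if no exact match exists, criteria
    are relaxed in reverse order (interest scores first, sex last).
    The chosen control is removed from the pool so matching stays one-to-one.
    """
    for k in range(len(CRITERIA), 0, -1):
        keys = CRITERIA[:k]  # the k highest-priority criteria
        for candidate in pool:
            if all(candidate[c] == participant[c] for c in keys):
                pool.remove(candidate)
                return candidate
    return None
```

Because initial major field choice is only the third criterion, relaxed matches can still differ slightly in the eleventh-grade distributions, which is the point made about Table 12.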


Specific major field choices of the SSTP
participants and matched controls are shown in Table
12, both at the eleventh-grade level and at the
time they entered college. The columns of data
under "eleventh grade" indicate that the matching
was very close. With the exception of "other"
fields (where the difference between participants
and controls was 1.3 percent), no specific major
field choice showed a difference as great as one
percent. (Of course, it probably would have been
possible to obtain exact matches in these distributions
of eleventh-grade choices if initial major field
choice had been the first, rather than the third,
matching criterion; see above.)


The data in Table 12 indicate that the effects of
SSTP participation on choice of a major field are
limited to a major in either physical or biological
science. The decline in student interest in these
fields among the matched controls (-9.2 percent)
is much greater than it is among the SSTP
participants (-1.3 percent). This comparative net gain
among the SSTP participants (approximately 8
percent) is very close to the net gains shown
previously among white participants in the regression
analysis (Table 7). It is important to note that SSTP
participation appears to have no effect on student
interest in majoring in engineering, medical
sciences, or social sciences. The increase in student
interest in social science among the SSTP
participants, for example, is paralleled by an almost
identical increase among the matched controls. The
relatively large increase in "other" choices among
the matched controls appears to be primarily the
result of dropouts from the physical and biological
sciences.
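
The -9.2, -1.3, and "approximately 8 percent" figures quoted above follow directly from the physical and biological science rows of Table 12. A small Python check, using only those tabled percentages (the variable and function names are illustrative):

```python
# Table 12 percentages: (participants 11th grade, participants entering college,
#                        controls 11th grade, controls entering college)
TABLE_12 = {
    "Physical Sciences": (34.2, 33.1, 33.5, 26.7),
    "Biological Sciences": (13.0, 12.8, 12.1, 9.7),
}


def change(before_col, after_col):
    """Summed change across the two science fields, in percentage points."""
    total = sum(row[after_col] - row[before_col] for row in TABLE_12.values())
    return round(total, 1)


participant_change = change(0, 1)  # participants: eleventh grade -> college
control_change = change(2, 3)      # matched controls: eleventh grade -> college
net_gain = round(participant_change - control_change, 1)
```

Here `participant_change` is -1.3, `control_change` is -9.2, and `net_gain` is 7.9, the "approximately 8 percent" comparative gain.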


A final note of interest from this matching
analysis is that the proportion of black students among
the matched controls (1.8 percent) was even
smaller than the percent among nonparticipants shown
earlier in Table 1 (3.0 percent). Thus, when SSTP
participants are compared with nonparticipants
with identical interests and abilities, the
overrepresentation of blacks among the participant group
(7.5 percent) appears even larger.


SUMMARY 


The purpose of this report was to evaluate the
SSTP of the NSF by examining the characteristics
of students selected into the program and by
estimating some of the effects of the Program on the
student's educational and vocational plans and
achievements. Longitudinal data from a national
sample of students who participated in both the 1966
National Merit Testing Program and the ACE
survey of college freshmen in 1967 reveal the
following:


1. In terms of academic ability and academic
achievement in high school, SSTP participants
represent a distinctly superior group. Compared
with other high school students, SSTP participants
are also younger, more highly motivated for
graduate study, and more likely to be men. Their
parents, compared with the parents of other high school
students, tend to be more highly educated and
affluent, are less likely to be Roman Catholic, and are
more likely either to be Jewish or to have no
formal religion.


2. The number of black students among SSTP 
participants is four times greater than the number 
that would be expected in any comparable group with 
similar interests and abilities. It seems likely that 
this overrepresentation of blacks is a direct con- 
sequence of the fact that the program attempts to 
select a portion of the students because their educa- 
tional opportunities have been limited. 


3. Among white students of both sexes, SSTP
participation appears to have a positive effect on
their interest in majoring in science in college, on
their interest in pursuing a career as a scientist,
and on their intention to get a PhD. Similar, but
less pronounced, program effects were observed
among black male participants. Among the black
female participants, however, no such effects were
observed.


4. The study also produced evidence that SSTP
participation increases the student's chances of
winning an award in a state or regional science contest
while in high school. This finding obtained for black
and white students of both sexes.


ABSTRACT

A Spanish translation of the Sixteen Personality Factors¹ test was
administered to 524 freshmen at Venezuela's Central University.
The responses were scored using the original scoring key provided
by the editor of the test: (1) The results of the test were submitted
to an item analysis program (FORTAP). The results of the item
analysis showed that only four factors (C, H, F, Q4) had acceptable
reliability (Internal Consistency ≥ .50). (2) The acceptable factors
(C, H, F, Q4) and items within these factors susceptible to
improvement were analyzed from two points of view, verbal content
and statistical indices (biserial correlations, X50 and Beta). (3) A
Reciprocal Averages Program was used to explore the possibility of
improving the factor reliability through changing the weighted
values of the response alternatives.

These results indicate that the correct grammatical translation
from English to Spanish of the Sixteen Personality Factor Test was
not sufficient to obtain acceptable reliability indexes. Item analyses
were useful in detecting faulty items.


ONE OF THE crucial problems confronting
education in most developing countries is the
scarcity of appropriate instruments for locating students
on the various relevant subject-matter, personality,
attitudinal, aptitude, or interest continua.
Because many scientists in developed countries
believe that there is essentially no point in repeating
instrument development already completed in their
countries, there is a great temptation to recommend
the translation of desired instruments and to
utilize them with the scoring procedures and norms
of the country of their development. Secondly,
many scientists in developing countries who feel
they need tests of one form or another, but who
have neither the time nor the resources to develop
such an instrument, are tempted to use translated
tests and norms as a solution to their problems.


Sufficient data exist (2, 5, 3, 11) to suggest that
though national groups are undoubtedly similar in
their fundamental humanity, they are also
sufficiently different to prohibit the use of identical
indices for group and individual comparisons.
Nevertheless, direct translations of tests are being used
without adequate attention to the fact that cultural
nuances might require a complete re-norming,
perhaps even a complete re-factoring, of any given
test before it should be used in any of the various
educative and mental-health guidance processes.


This study tested the feasibility of using a
Spanish translation of Cattell's Sixteen Personality
Factor Test (16 PF). For the purposes of this study,
the basic factor structure described by Cattell was
assumed to be valid; that is, no attempt was made
to recreate the basic factor structure. Rather, the
study focused upon the function of factors within the
battery and upon the function of items within these
factors.


Reliability coefficients were considered to be the
most important statistical index in determining the
acceptability of the factors, since reliability
coefficients are pertinent to validity in the negative sense;
that is, unreliable factors cannot be valid (12).
Hence, assuming Cattell's 16 PF factors to be
universally valid, if they are not reliable for the Venezuelan
population tested, it could be stated that the items
and factors do not measure the proposed traits in
that context.


For personality tests, the literature reviewed
does not contain specific, acceptable values for
reliability coefficients. In consequence, an
a priori value of .50 was taken as the smallest
acceptable value. The decision to use this value was
determined on two bases: (1) in the personality
test area the reliabilities obtained are
frequently lower than those obtained with aptitude and
achievement tests (8), and (2) the lowest
reliability value accepted by Cattell in the 16 PF was
.54, for factor Q1 (see Table 1).
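
For readers who want to see how a Hoyt reliability coefficient of the kind reported here can be computed and checked against the .50 criterion, the following Python sketch uses Hoyt's analysis-of-variance formulation, r = (MS_persons - MS_residual) / MS_persons, which is algebraically equivalent to coefficient alpha. It is a generic illustration; the function names are assumptions, and it is not the FORTAP program used in the study.

```python
def hoyt_reliability(scores):
    """Hoyt (ANOVA) reliability for a persons-by-items table of item scores.

    `scores` is a list of rows, one per person, each a list of item scores
    (for example 0, 1, or 2 for each item of a factor).
    """
    n = len(scores)       # number of persons
    k = len(scores[0])    # number of items
    grand = sum(sum(row) for row in scores) / (n * k)
    person_means = [sum(row) / k for row in scores]
    item_means = [sum(row[j] for row in scores) / n for j in range(k)]

    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_persons = k * sum((m - grand) ** 2 for m in person_means)
    ss_items = n * sum((m - grand) ** 2 for m in item_means)
    ss_residual = ss_total - ss_persons - ss_items

    ms_persons = ss_persons / (n - 1)
    ms_residual = ss_residual / ((n - 1) * (k - 1))
    if ms_persons == 0:
        raise ValueError("persons do not differ; reliability is undefined")
    return (ms_persons - ms_residual) / ms_persons


def is_acceptable(reliability, minimum=0.50):
    """Apply the a priori criterion adopted in the text."""
    return reliability >= minimum
```

With the coefficients reported later, `is_acceptable(0.523)` holds for factor C, while a factor with a reliability of .45 falls below the criterion.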


PROCEDURES 


Form A of the 1962 edition of the 16 PF, as
translated in Chile by Naranjo (10), was adapted
to Venezuelan Spanish. Changes were made in
colloquial expressions, which vary among
Spanish-speaking countries.


Subjects 


The test was given to a sample of 524 Ss
composed of 300 males and 224 females
between the ages of 18 and 25. These students were
entering freshmen at the Medical School of
Venezuela's Central University in Caracas. The test
was presented to the students as an experimental
questionnaire whose only purpose was to help them
in the future if they needed some type of guidance.


Responses were correct[ed] [...]
keys provided by the Ame[rican] [...]
[...]; the intermediate response 1, and the
absence of trait 0.


The program
was set up to run sixteen times at once,
consider[ing] [...] [indep]endent test. This
[...] [prog]ram to give inde[pendent] [...]
[i]tems included in a
factor.


The extent of mutual influence among the sixteen
factors of the Spanish version was computed using
the Pearson product-moment formula. The
computational work was done with a computer program
developed by Wolfe (13).
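
As an illustration of the Pearson product-moment computation mentioned above, the sketch below correlates every pair of factor score lists. It is a generic implementation written for this note, not Wolfe's program; the `scores` layout (factor name mapped to a list of subject scores in a common order) is an assumption.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def correlation_matrix(scores):
    """Return {(factor_a, factor_b): r} for every ordered pair of factors."""
    names = list(scores)
    return {(a, b): pearson_r(scores[a], scores[b]) for a in names for b in names}
```

The resulting matrix is symmetric with 1.00 on the diagonal, so only the lower triangle needs to be tabled.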


RESULTS 


The trait construct was accepted at face value.
However, in the present analysis, factors which
do not meet the minimum reliability criterion level
(.50) were not analyzed.


Table 1 shows a comparison between the factor
reliability determined in Venezuela with the Spanish
version and that obtained in the United States by
Cattell. Note that for the Venezuelan sample only

TABLE 1

COMPARISON OF THE RELIABILITY
COEFFICIENT OBTAINED IN VENEZUELA
AND THE USA*

             Venezuelan     American
Factors      Sample         Sample

A              .28            .82
B              .20            .75
C              .52            .89
E              .45            .82
F              .62            .79
G              .38            .74
H              .71            .70
I              .35            .61
L              .13            .63
M              .14            .79
N             -.01            .64
O              .37            .74
Q1            -.06            .54
Q2             .46            .64
Q3             .38            .61
Q4             .56            .79

*The reliability coeffici[ents] [...] [sam]ple were taken from
t[he] [...] [Kud]er-Richardson Formula 20, which represents an
average of all possible split halves to obtain a reliability
coefficient [...]


four factors (C, F, H, and Q4) of the sixteen (4)
have acceptable (.50) reliability indices.


Content and Statistical Analysis of the Acceptable
Factors


Once the acceptable factors (C, F, H, Q4) were
identified, each was analyzed in further
detail. To do this, item characteristics within
factors were considered and suggestions were made
for improving illustrative items.


Factor C. This factor measures dynamic
integration and maturity as opposed to general
emotionality. It can be equated to Eysenck's "general
neuroticism" (6). Usually the C person is easily
annoyed by things and people, is dissatisfied with the
world situation, his family, the restriction of life,
and his own health.

The item analysis (see Table 2) shows a Hoyt
reliability of .523 for the thirteen items comprising
this scale. The mean discrimination indices for
the three possible choices are:

                                      Mean Biserial
Type of Choice                        Correlation

Presence of trait (maturity)             .463
In between                              -.176
Absence of trait (emotionality)          [?]

These mean biserial correlation values are within
the accepted range for discrimination indices.

Looking to the percentages of answers given to
each choice (see Figure 1), a higher tendency can
be observed toward the use of the "presence of trait"
choice (53 percent) than toward the "absence of trait"
choices (20 percent).

Analysis of an Item Susceptible to Improvement

Item 104. "When [...] I just: (a) keep quiet,
(b) in between, (c) despise them."

The presence of trait choice, "a," has a low Beta
(.30) and a negative X50 (-1.47). This is probably
due to translation of the choices. "Keep quiet"
was translated literally as "don't speak" (me callo)
and "despise" as "I disdain them" (los desprecio).
This last option is very "strong" in the Venezuelan
culture and might be forcing many people low in this
scale to select choice "a."


TABLE 2

DESCRIPTIVE STATISTICS FOR ITEMS COMPRISING
FACTOR C (HOYT r = .523)

Item               Percentage
Number   Weight   of Responses   Biserial      X50      Beta

   4       2         [?]           [?]         [?]      .40
           1         [?]           [?]         [?]      [?]
           0           2          -.88       -5.52     -.41

   5       0         [?]           [?]         [?]      [?]
           1         [?]           [?]         [?]      [?]
           2          71           .49       -1.10      .57

  29       0          53          -.55         .16     -.66
           1           8          -.03      -48.69     -.03
           2          39           .58         .51      .12

  30       2          73           .38       -1.63      .40
           1          20          -.25       -3.46     -.25
           0           7          -.40       -3.64     -.43

  55       2          80           .39       -2.17      .42
           1          11          -.21       -6.03     -.21
           0           9          -.42       -3.14     -.46

  79       0          20          -.59       -1.45     -.73
           1          18          -.19       -4.81     -.20
           2          62           .57        -.58      .68

  80       0           8          -.55       -2.61     -.63
           1          53          -.20         .34     -.20
           2          39           .41         .87      .45

 104       2          66           .29       -1.47      .30
           1          24          -.12       -5.71     -.12
           0          10          -.41       -3.28     -.44

 105      [?]        [?]           [?]         [?]      [?]
          [?]        [?]           [?]         [?]      [?]
          [?]        [?]           [?]         [?]      [?]

 129      [?]        [?]           [?]         [?]      [?]
           1         [?]           [?]         [?]      [?]
           2          63           .61        -.96      .78

 130       2          58           .40        -.48      .43
           1          20          -.09       -9.43     -.09
           0          22          -.43         [?]      [?]

 154       0           6          -.52       -2.92     -.61
           1          27          -.43       -1.38     -.48
           2          67           .51        2.79      .10

 179       2          39           .49         [?]      [?]
           1          28          -.06      -10.19     -.06
           0          33          -.44        -.92     -.53


Factor F. This factor is one of the most
important in measuring extroversion and introversion.
Individuals with high scores in F are usually more
optimistic and have a more happy-go-lucky attitude
toward life. Individuals with low F scores tend to
be more worried and depressed by common life
problems. The Hoyt reliability for factor F (see
Table 3) is .616. The distribution of individuals
choosing among the three possible answers is well
balanced: 39 percent, 25 percent, and 36 percent
(see Figure 1). The mean discrimination indices
are the following:

                                      Mean Biserial
Type of Choice                        Correlation

Presence of trait (extroverted)          .50
In between                              -.43
Absence of trait (introverted)          -.49

Analysis of an Item Susceptible to Improvement

Item 83. "I would hate to be where there [...]
a lot of people to [...]: (a) true, (b)
uncertain, (c) false."

The part of the item "a lot" was left in the
Venezuelan version with the Chilean expression "harta,"
a word seldom used in Venezuela with this meaning,
but frequently used to indicate "being tired of
something." This translation error probably explains the
poor statistical Beta value (.31) obtained for this
item. Changing the word "harta" to "muchos"
(many) should improve this item.


FIGURE 1

DISTRIBUTION OF THE RESPONSES ACCORDING TO CHOICES


Factor H. H minus individuals tend to
leptosomatic characteristics (shy and restrained
temperament). H plus persons tend to be those
individ[uals] [...]


The Hoyt reliability coefficient obtained for this
factor is .709 (see Table 4). The distribution of
the answers, in percentage values, among the three
options is well balanced: (a) 38 percent, (b) 27
percent, (c) 35 percent (see Figure 1). The mean
biserial correlations for the choices of this factor
are:

                                      Mean Biserial
Type of Choice                        Correlation

Presence of trait (adventurous)          .953
In between                              -.034
Absence of trait (shy)                  -.541

Analysis of an Item Susceptible to Improvement

Item 86. "I would rather have a job with: (a) a [...]

[...] [indi]cate shyness, has an X50 of (.77) and a Beta of
(-.37). This item undoubtedly has a different
con[...] not as competitive as the Amer[ican] [...]

[...] [satis]fied with their life. The Hoyt reliability
coefficient for factor Q4 is .56 (see Table 5). The
distribution of answers tends toward those measuring
phleg[matic] [...]

[Figure 1 residue: mean percentages of responses across
factors: 43, 22, 35]

                                      Mean Biserial
Type of Choice                        Correlation

Presence of trait (excitable)            .47
In between                               .02
Absence of trait (composed)             -.48

Analysis of an Item Susceptible to Improvement

Item 149. "I tend to tre[mble] [...]
I think of a difficult task ahead: (a) generally, (b)
occasionally, [...]

[...] example of an i[tem] [...]ing to these extremes, [...]
view this item is i[...] (X50 = 2.91, B = .45) [...]
[...]tion of the test it [...]
think of all the things lying ahead of me," [...]

[...] not on a priori consideration but on the
response records of the individuals.

A reciprocal averages program (RAVE) developed
by Baker (1) and available as a library
program at The University [...]
[...]ed. This program does not take into account the
score weighted 0. In consequence, the


TABLE 3


DESCRIPTIVE STATISTICS FOR ITEMS COMPRISING
FACTOR F (HOYT r = .616)

Item               Percentage
Number   Weight   of Responses   Biserial      X50      Beta

   8       0          43          -.46        -.39     -.52
           1          18           .01       63.59      .01
           2          39           .46         .99      .52

  33       2          42           .52         .41      .61
           1          30          -.04      -11.51     -.04
           0          28          -.56       -1.04     -.68

  58       2          26           .40        1.60      .43
           1          62          -.10        2.76     -.10
           0          12          -.43       -2.64     -.48

  82       0          26          -.57       -1.14     -.69
           1          21          -.17       -4.74     -.17
           2          53           .58        -.14      .72

  83       2          31           .30        1.63      .31
           1          18           .04       21.43      .04
           0          51          -.29         .07     -.31

 107       0          37          -.57        -.58     -.69
           1          17          -.06      -15.74     -.06
           2          46           .58         .19      .71

 108       0          51          -.33         .09     -.36
           1          24           .10        6.80      .11
           2          25           .32        2.12      .34

 132       2          31           .47        1.07      .53
           1          32           .05        9.84      .05
           0          38          -.48        -.65     -.55

 133       2          32           .53         .83      .63
           1          23           .03       24.73      .03
           0          45          -.51        -.29     -.60

 157       0          70          -.43        1.21     -.46
           1          13           .15        7.42      .16
           2          17           .41        2.04      .53

 158       0          38          -.68        -.43     -.94
           1          14          -.07      -14.98     -.07
           2          48           .70         .10      .97

 182       2          48           .65         .06      .86
           1          37          -.28       -1.29     -.29
           0          15          -.66       -1.54     -.88

 183       2          73           .47       -1.32      .54
           1          17          -.31       -3.08     -.33
           0          10          -.45       -2.89     -.51
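
The X50 and Beta columns of the item tables can be related to the biserial correlation and the response percentage under a normal-ogive item model. The sketch below uses the standard relations Beta = r / sqrt(1 - r^2) and X50 = gamma / r, where gamma is the standard-normal deviate above which the observed proportion of respondents falls. That the item analysis program used exactly these formulas is an assumption; the function names are illustrative.

```python
from math import sqrt
from statistics import NormalDist


def beta_from_biserial(r_bis):
    """Slope of the item characteristic curve implied by a biserial correlation."""
    return r_bis / sqrt(1.0 - r_bis ** 2)


def x50_from_biserial(r_bis, proportion_choosing):
    """Criterion score (in SD units) at which the choice has a 50-50 chance.

    `proportion_choosing` is the fraction of respondents selecting the choice.
    """
    gamma = NormalDist().inv_cdf(1.0 - proportion_choosing)
    return gamma / r_bis
```

For a choice selected by 42 percent of respondents with a biserial correlation of .52 (item 33, weight 2, in Table 3), these relations give a Beta of about .61 and an X50 of about .39, close to the tabled values.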


scoring system (2, 1, 0) was transformed to 3, 2,
1 to obtain maximum efficiency from the program.


A comparison of the reliabilities obtained with
the original weights and with the new weighting
sys[tem] [...]
Table 6) with respect to the reliabilities obtained
with the original scoring. An inspection of the new
weights shows that most of the changes in scoring
were toward giving more weight to the "in between"
answers. There was no change in the weighting
direction; i.e., the "presence of trait" choices
always received the highest weight, 2 points in the
original scoring and 3 points in the new scoring.
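
The general idea of reciprocal averaging can be sketched as follows: each person's score is the sum of the weights of the options chosen, each option's new weight is the mean score of the persons who chose it, and the weights are rescaled and the cycle repeated. This is a generic illustration under stated assumptions, not Baker's RAVE program; the rescaling onto the range 1 to 3 mirrors the 3, 2, 1 scoring mentioned in the text, and all names are hypothetical.

```python
def reciprocal_averages(responses, start_weights, lo=1.0, hi=3.0, n_iter=25):
    """Iteratively re-weight response options by the method of reciprocal averages.

    responses[p][i]     -> index of the option person p chose on item i
    start_weights[i][o] -> a priori weight of option o of item i
    Options nobody chose keep their previous weight.
    """
    weights = [list(w) for w in start_weights]
    for _ in range(n_iter):
        # Person score: sum of the current weights of the chosen options.
        totals = [sum(weights[i][opt] for i, opt in enumerate(person))
                  for person in responses]
        # Option mean: average total score of the persons choosing that option.
        means = []
        for i, item_w in enumerate(weights):
            row = []
            for opt in range(len(item_w)):
                chosen = [t for t, person in zip(totals, responses)
                          if person[i] == opt]
                row.append(sum(chosen) / len(chosen) if chosen else None)
            means.append(row)
        observed = [m for row in means for m in row if m is not None]
        low, high = min(observed), max(observed)
        if high == low:
            break
        # Rescale the option means linearly onto [lo, hi].
        weights = [
            [old if m is None else lo + (m - low) * (hi - lo) / (high - low)
             for m, old in zip(row, item_w)]
            for row, item_w in zip(means, weights)
        ]
    return weights
```

When the a priori weights already order the options the way the respondents' total scores do, the procedure returns essentially the same weights, which is consistent with the small changes in reliability reported for the new scoring.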


TABLE 4

DESCRIPTIVE STATISTICS FOR ITEMS COMPRISING
FACTOR H (HOYT r = .709)

Item               Percentage
Number   Weight   of Responses   Biserial      X50      Beta

  10       2          31           .70         .73      .97
           1          49          -.16        -.03     -.16
           0          20          -.65       -1.31     -.86

  35       0          45          -.63        -.21     -.82
           1          16          -.02      -48.04     -.02
           2          39           .66         .40      .89

  36       2          59           .57        -.38      .70
           1          21          -.16       -5.10     -.16
           0          20          -.63       -1.32     -.81

  60       0          43          -.60        -.30     -.75
           1          27           .09        6.93      .09
           2          30           .60         .90      .74

  61       0          52          -.65         .06     -.87
           1          22           .22        3.51      .22
           2          26           .60        1.06      .76

  85       0          31          -.68        -.71     -.92
           1          23          -.09       -8.13     -.09
           2          46           .68         .15      .92

  86       0          61          -.35         .77     -.37
           1          11          -.03      -47.14     -.03
           2          28           .40        1.37      .44

 110       2          57           .48        -.37      .54
           1          20          -.08      -10.81     -.08
           0          23          -.55       -1.36     -.66

 111       2          23           .22        3.22      .23
           1          45           .01       23.78      .01
           0          32          -.20       -2.47     -.20

 135      [?]        [?]           [?]         [?]      [?]
           1          46          -.06       -1.45     -.06
          [?]        [?]           [?]         [?]      [?]

 136       2          36           .52         [?]      [?]
           1          28          -.00     -202.90      .01
           0          36          -.92        -.69     -.60

 161       0          52          -.51         .06     -.59
           1          18           .10        9.45      .10
           2          30           .50        1.01      .58

 186       2          55           [?]         [?]      [?]
           1          26          -.25       -2.55     -.97
           0          19          -.44       -1.98     -.43


These results demonstrate that it was not feasi[ble]
[...] the Spanish version of the 16 PF by
changing the original scoring system.


Inter-Factor Correlations


The inter-factor Pearson correlations which
re[sulted] from the Spanish version are given in Table


TABLE 5

DESCRIPTIVE STATISTICS FOR ITEMS COMPRISING
FACTOR Q4 (HOYT r = .561)

Item               Percentage
Number   Weight   of Responses   Biserial      X50      Beta

  25       0          52          -.44         .13     -.50
           1          11           .21        5.86      .22
           2          37           .36         .94      .39

  49       2          35           .52         .72      .62
           1          22          -.01      -72.30     -.01
           0          43          -.49        -.38     -.56

  50       2          48           .94         .11      .63
           1          19          -.08      -10.79     -.08
           0          33          -.53        -.84     -.62

  74       2          43           .51         .32      .60
           1           9           .02       61.79      .02
           0          48          -.52        -.12     -.60

  75       0          47          -.31        -.28     -.33
           1          29           .02       33.07      .02
           2          24           .37        1.84      .40

  99       2          32           .59         .78      .73
           1          24           .08        9.29      .08
           0          44          -.60        -.27     -.74

 100       0          79          -.46        1.75     -.52
           1          10           .30        4.18      .82
           2          11           .42        2.92      .47

 124       2          20           .59        1.43      .13
           1          21           .29        2.66      .31
           0          59          -.64         .32     -.84

 125       0          53          -.41         .20     -.45
           1          26           .10        6.17      .11
           2          21           .45        1.80      .51

 149       2          11           .41        2.91      .45
           1          38           .20        1.55      .21
           0          51          -.39         .05     -.43

 150       0          41          -.36        -.62     -.38
           1          19          -.08      -11.15     -.08
           2          40           .41         .63      .46

 174       2          45           .55         .25      .65
           1          23           .04       17.83      .04
           0          32          -.64        -.72     -.82

 175       0          48          -.39        -.11     -.43
           1          21           .15        5.60      .15
           2          31           .33         [?]      [?]


TABLE 6

HOYT RELIABILITY INDICES OBTAINED WITH
THE ORIGINAL SCORING SYSTEM AND WITH THE
NEW WEIGHTS DETERMINED BY THE RECIPROCAL
AVERAGES METHOD

             Reliabilities         Reliabilities
             From                  For the New
Factors      Original Scoring      Scoring

A                 .28                 .29
B                 .20                 .20
C                 .52                 .53
E                 .45                 .46
F                 .62                 .63
G                 .38                 .42
H                 .71                 .72
I                 .35                 .38
L                 .13                 .23
M                 .14                 .18
N                -.01                -.01
O                 .37                 .41
Q1               -.06                -.06
Q2                .46                 .46
Q3                .38                 .38
Q4                .56                 .59


7. In Table 8 are shown the inter-factor
correlations obtained by Cattell (3). Comparison of the
inter-factor correlations obtained in America and
in Venezuela shows several differences:


1. In the American correlation matrix
(N = 408), 30 percent of the interfactor
correlations are significant at the .01
level. In the Venezuelan matrix (N = 524),
72 percent of the correlations are
significant at the same level.


2. Twenty-five percent of the Venezuelan
correlation indices are opposite in
sign to the ones obtained in America.


3. In the American correlation matrix the
highest correlation value is .30. In the
Venezuelan matrix, twelve of the
correlations have absolute values higher than
[...] [Span]ish version.


TABLE 7

INTER-FACTOR CORRELATION MATRIX OF THE SPANISH VERSION OF THE SIXTEEN PERSONALITY
FACTOR TEST

Factor    A     B     C     E     F     G     H     I     L     M     N     O    Q1    Q2    Q3    Q4

A       1.00
B       -.09  1.00
C        .03   .09  1.00
E       -.01   .10   .13  1.00
F        .16  -.01   .17   .42  1.00
G        .13  -.10   .20  -.21  -.07  1.00
H        .22  -.06   .42   .41   .48   .04  1.00
I        .19  -.04  -.20  -.19  -.21   .03  -.11  1.00
L       -.01  -.04  -.16   .10   .06  -.07  -.02  -.02  1.00
M       -.01  -.01  -.21  -.13  -.22  -.07  -.17   .31   .09  1.00
N        .03   .06   .05  -.07  -.13   .08  -.05   .06  -.06   .05  1.00
O        .06  -.09  -.43  -.11  -.04  -.10  -.31   .08   .18   .10  -.10  1.00
Q1      -.13   .06   .15   .01  -.02  -.08   .06   .07  -.04   .00   .09  -.20  1.00
Q2      -.23   .01  -.06  -.23  -.43   .01  -.32   .08  -.02   .10   .10  -.02   .11  1.00
Q3       .05   .01   .28  -.25  -.17   .26   .05  -.01  -.18  -.10   .14  -.28   .06   .03  1.00
Q4       .00   .01  -.53   .04  -.07  -.21  -.32   .13   .26   .21  -.05   .45  -.16   .03  -.35  1.00

CONCLUSION


The results reported herein must be evaluated
with the fact that they represent the first attempt
made in Venezuela to validate the 16 PF.


The reliability indices obtained show that only
four factors (C, F, H, Q4) have an acceptable
internal consistency and in consequence might also have
acceptable validity.


The distribution of the answers shows a general
imbalance in favor of the "presence of trait" choices
(43 percent) over the "absence of trait" choices
(35 percent).


The discrimination indices indicate that 38
percent of the items forming the acceptable factors
made no substantive contribution to the personality
trait they are intended to measure. The same is
true for 71 percent of the items forming the
nonacceptable factors.


A comparison of the inter-factor correlation
matrices obtained in America and Venezuela showed
substantive differences. This indicated that the
factor relationship employed by Cattell to establish the
second order factors may not be valid in Venezuela.


It was demonstrated with a reciprocal averages
analysis that it is not possible to improve the
statistical indices by employing a different
weighting system in the scoring of test results.


One problem of this Spanish version was that
all the items were written taking as a
prime consideration the grammatical similarity of
the translated words and sentences, instead of
trying to adapt culturally the construct behind the
original American items. Another problem was the
quality of the 1962 edition: according to a letter
received from the editors of the test (Institute for
Personality and Ability Testing), the new 1968
American edition is superior to the edition translated
into Spanish. Upon review it was found that
sixty-nine items (39 percent) were different in content
in the 1968 edition from the 1962 edition.


The statistical results obtained in the present
study demonstrate that this Spanish version of the
16 PF does not meet the minimum requirements for
reliability and validity established by the American
Psychological Association in its "Standards for
Educational and Psychological Tests and Manuals"
(12); therefore it cannot be recommended for the
diagnosis, prognosis, or evaluation of personality
with the group from whom the data were collected.
Further research is necessary to ascertain the
external generalizability of this inference to
Venezuela or Latin America in general.


In adapting tests from different cultures it is
reasonable to suggest that attention must be given
to the translation of individual items, [...]
item responses, [...]
systematically va[...]
[orig]inal item in relat[...]

TABLE 8

INTER-FACTO[...] (title truncated; matrix entries illegible in the source)


FOOTNOTES

1. Institute for Personality and Ability Testing, Champaign,
   Illinois.

2. X50 is the point in the criterion scale at which the item choice
   has maximum discrimination, i.e., it is the point on the
   criterion scale, given in standard deviation units,
   corresponding to the median of the item characteristic curve of
   that choice. Subjects with a criterion score equal to X50 have a
   50-50 chance of choosing that response.

3. Beta is the reciprocal of the standard deviation of the item
   characteristic curve and can be thought of as the slope of the
   item characteristic curve at the X50 point. It gives the
   discrimination power of the item in values that go from minus
   infinity to plus infinity.

It is interesting to note that the two factors (N, Q,) with the
lowest reliability (-.01, -.06) are, according to Cattell, acquired
social traits.
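
The X50 and Beta definitions in footnotes 2 and 3 can be made concrete with a short sketch. It assumes the item characteristic curve is a cumulative normal (normal ogive), which is the usual reading of "reciprocal of the standard deviation of the item characteristic curve"; the function name `choice_probability` and the numeric values are hypothetical and are not taken from the study.

```python
import math

def choice_probability(x, x50, beta):
    """Normal-ogive item characteristic curve (assumed form).

    x    : criterion score, in standard deviation units
    x50  : criterion score at which the choice is made half the time
    beta : reciprocal of the standard deviation of the curve
    """
    z = beta * (x - x50)
    # Cumulative standard normal evaluated at z
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# Hypothetical item choice: X50 = 0.5, Beta = 2.0
p_at_x50 = choice_probability(0.5, 0.5, 2.0)   # subject located exactly at X50
p_above = choice_probability(1.5, 0.5, 2.0)    # subject one SD above X50
```

Under this assumed form a larger Beta makes the curve steeper around X50, so the choice separates subjects just below X50 from those just above it more sharply.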
Book Reviews
(continued from page 12)

of the six Mental Measurements Yearbooks. A coding system is used by which information is given concerning
such basic information as the name of the publisher, for what levels the tests were designed, the organization
of the test, and where additional information can be found.


The heart of the book is the set of complete reviews from the six Mental Measurements Yearbooks given
in chronological sequence. It puts into one volume the information that otherwise would have to be searched out
in six books extending back to 1938.


As an additional useful tool, there is a chapter that presents a classified listing of all tests appearing in
one or more of the six Mental Measurements Yearbooks. These tests cover the entire field of objective
measurement from "Achievement Batteries" to "Specific Vocations."

Lastly, for ready reference, there are a Publishers Directory, an Index of Test Titles, and an Index of
Names of Authors, Test Reviews, and Others Mentioned in the Volume.


Reading Tests and Reviews has been skillfully designed to provide an efficient tool for educators who wish
to be intelligently informed about available measures. The book is a delight to use and takes its place alongside
Webster's Unabridged Dictionary as a basic reference for the professional.


Alfred S. Lewerenz, Reviewer 
Educational Consultant 
Hollywood, California 


REACH, TOUCH AND TEACH


Borton, Terry, (New York: McGraw-Hill, 1970), 213 pp. $4.95. 


THERE APPEARS to be the emergence of a new paradigm in education. The past and present failure of
the schools to educate a substantial number of the young is bringing forth not only devastating critiques but also
some exciting proposals for the future. One such contribution, Reach, Touch and Teach, is outstanding insofar
as it not only provides an image of a future paradigm for education but reveals glimpses of the transformations
through which schools and individual teachers must go to implement the new paradigm. For those whose goal is
process education, Terry Borton has presented a useful guide.


In the new paradigm, the goal of education is the learning of those processes and skills through which an
individual may change himself. In part, Borton notes, this goal is similar to the classical educational objective
of training habits of mind; according to that model, the student would develop the intellectual skills to discover

(continued on page 41)
THE RELIABILITY AND VALIDITY OF


QUICK TESTS WITH HIGH SCHOOL SENIORS 


GEORGE W. BOHRNSTEDT, 
University of Minnesota 


PHILIP LAMBERT, and EDGAR F. BORGATTA 
The University of Wisconsin 


ABSTRACT 


The Quick Word Test (QWT), Quick Number Test (QNT), and a number of
criterion verbal and numerical tests were related with the English
and Math grade point average (GPA) scores in this study. The QWT,
in general, had lower correlations with English GPA scores than the
criterion tests. The correlations between the QNT and the Math GPA
were approximately at the same level as the criterion measures.


IN A PRIOR report, Borgatta and Bohrnstedt (2) indicated the
utility of a "Quick" test in assessing the performance of college
age students. The report suggested that the correlations of the
"Quick" test with GPA were essentially of the same magnitude as
those with standard college assessment tests. Since both the Quick
Word Test (QWT) (3) and the Quick Number Test (QNT) (5) were
designed to be used with the range of intelligence found in a
normal adult population, they should be well-suited for use with
students in the high school range. From a practical point of view,
the "Quick" tests might be useful in localities where more
conventional testing to be used in advising students may be more
difficult to administer. Thus, the use of tests, rather than the
misuse of tests, has been under attack and conventional testing
programs have been cut out of budgets. Additionally, in some
locations tests are administered on a self-selective basis only to
those who ask to be tested. This may exclude some persons who
choose not to be tested for economic reasons, but motivational and
aspirational factors may be very important in the self selection.
Obviously, persons never tested cannot be advised, and thus many
persons' high potential may be neglected if additional supporting
information is not in their files to confirm high school
performance, or to lead to examination for potential in cases where
high test scores are not accompanied by high grades. Having some
tests uniformly administered in senior high school can be justified
by the additional information it puts into the student's dossier.
The special interest in "Quick" tests of the type reported in this
research is that they can make testing feasible in difficult
situations and, in less difficult circumstances, can add to the
base on which advice is offered to students.


In the current research, cooperation was provided by a school
system in a southwestern city in which several types of standard
tests commonly used for college advisement were available. The
"Quick" tests were administered in three schools, providing 1,186
participants for the study. The tests selected as relevant for
comparisons with the "Quick" tests were the American College Test
(ACT) (1) English and Math scores, the National Merit Scholarships
(NMS) (6) English Usage and Math Usage scores, and the College
Entrance Examination Board (CEEB) (4) Verbal and Math scores.
Different (but overlapping) subgroups of the total had participated
in each of these testings, providing subsamples of 712, 373, and
190 respectively. The QWT was administered with a time limit of 15
minutes, although this is normally a power test. The time
restriction was not felt to materially alter the performances of
students, as at this level virtually all complete the form in 15
minutes.


The most desirable comparison that could be made would be in
subsequent performance of the participants. However, aside from not
having these longitudinal data available, such data necessarily
include only the segment of students that go on for higher
education. The best predictor of future performance is commonly
identified as "current" performance; thus, in lieu of a projection
into the future, the criteria used in this study are the cumulative
GPA's in English and Math available for the students.


RESULTS 


Table 1 presents the intercorrelation between the ACT English, ACT
Math, QWT, QNT, English GPA, and Math GPA for a subsample of 712
students.


TABLE 1


INTERCORRELATION BETWEEN ACT ENGLISH,
ACT MATH, QWT, QNT, AND GPA (N=712)


                    1a    2a    3     4     5     6

1a. ACT English     --   .48   .62   .36   .68   .50
2a. ACT Math              --   .39   .74   .49   .68
3.  QWT                         --   .38   .54   .38
4.  QNT                               --   [?]   .63
5.  English GPA                             --   .65
6.  Math GPA                                      --

[?] = entry illegible in the source.


The ACT English correlates with the English GPA with a coefficient
of .68, while the QWT correlation coefficient with the English GPA
is only .54. While a correlation of .54 between a verbal test and a
GPA in English is substantial, the value of .68 for the ACT English
score is impressively large in this sample. The correlation
coefficients for the ACT Math and QNT with the Math GPA are
respectively .68 and .63, both relatively high values.


Table 2 presents the correlation coefficients for National Merit
Scholarship English Usage and the QWT with the English GPA, which
are respectively .62 and .45. The correlation coefficients for the
NMS Math Usage and the QNT with Math GPA are respectively .68 and
.62. Thus, the "Quick" tests are performing approximately in the
same way in comparison to the NMS exams as in comparison to the
ACT, but in the subsample of the 373 students for which the NMS was
available, the relationships between the verbal scores and the
criterion are somewhat lower.


In Table 3 the data are presented for the subsample of 190 students
for whom CEEB scores were available. The data indicate the same
pattern, with the CEEB Verbal and the QWT correlated to the English
GPA with coefficients of .64 and .41 respectively. The CEEB Math
and QNT correlated with the Math GPA with coefficients of .66 and
.62 respectively.


TABLE 2


INTERCORRELATION BETWEEN NMS ENGLISH
USAGE, NMS MATH USAGE, QWT, QNT, AND
GPA (N=373)


                          1b    2b    3     4     5     6

1b. NMS English Usage     --   .46   .47   .38   .62   .53
2b. NMS Math Usage              --   .26   .69   .43   .68
3.  QWT                               --   [?]   .45   .32
4.  QNT                                     --   [?]   .62
5.  English GPA                                   --   .69
6.  Math GPA                                            --

[?] = entry illegible in the source.


DISCUSSION


The data presented in this analysis are initially disappointing for
the QWT. Possibly, a problem arises in the fact that the Ss of
these analyses are, self-selectively, the high performing students
in the high school, and thus the relationships between the tests of
abilities and the criteria are depressed by the reduced individual
variability. Still, the question of why the QWT should be
noticeably lower than the other verbal tests is not answered by
this general restriction. The average score for the QWT is high (74
out of a possible 100 for the CEEB subsample, for example), and
this suggests it might be advisable with such groups to use the
High Difficulty Form of the QWT.
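
The restriction-of-range argument above can be illustrated with a small simulation. This is a minimal sketch under stated assumptions, not a reanalysis of the study: the population correlation of .60, the sample size, and the rule of keeping only the upper half of test scorers are all hypothetical choices made for illustration.

```python
import math
import random

def pearson(xs, ys):
    """Pearson product-moment correlation of two equal-length lists."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

random.seed(0)
rho = 0.6  # hypothetical correlation in the unselected population
test, gpa = [], []
for _ in range(20000):
    t = random.gauss(0, 1)
    g = rho * t + math.sqrt(1 - rho ** 2) * random.gauss(0, 1)
    test.append(t)
    gpa.append(g)

r_all = pearson(test, gpa)

# Self-selection stand-in: keep only students above the test mean
selected = [(t, g) for t, g in zip(test, gpa) if t > 0]
r_selected = pearson([t for t, _ in selected], [g for _, g in selected])
```

In this sketch the correlation in the selected half is noticeably lower than in the full group, which is the direction of the effect the authors describe; it does not explain why the QWT in particular falls below the other verbal tests.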


The "Quick" tests in this application performed at a level that may
be considered reasonable but not outstanding. Other reports have
been more favorable, and possibly this more modest performance
emphasizes a more universalistic point, namely, given the objective
of providing more information to use as a basis for advising
students, additional efficient assessment instruments are required.
The principle of efficiency can be applied in test development to
shorten the time required. Possibly as more efficient tests of
ability and performance are developed, they can be administered
more routinely as "one more piece of information" in a dossier for
students. In general, a major problem that confronts the
educational testers is that insufficient general testing occurs
because accumulation of such information requires a formidable
investment of time and resources. The "Quick" tests or other tests
developed with equivalent strategies may facilitate a more mundane
view to the collection of such information.


TABLE 3


INTERCORRELATION BETWEEN CEEB VERBAL,
CEEB MATH, QWT, QNT, AND GPA (N=190)


                    1c    2c    3     4     5     6

1c. CEEB Verbal     --   [?]   .54   .36   .64   .48
2c. CEEB Math             --   .35   .68   .50   .66
3.  QWT                         --   .29   .41   .38
4.  QNT                               --   [?]   .62
5.  English GPA                             --   .65
6.  Math GPA                                      --

[?] = entry illegible in the source.
BLACK PUPILS CAN BE TAUGHT TO LISTEN


PERRY R. CHILDERS 
The University of Wisconsin-Milwaukee 


ABSTRACT


[...] 8.44 points, significant beyond .01 for the experimental
group. The con[...] It was concluded that the type S used in the
study could benefit substan[...]


THERE IS an increasing awareness of the importance of listening
ability in relation to pupil achievement. Investigations by Russell
(6), Witty (8), and Duker (2) offer evidence on the importance of
critical listening in reading and in written communications.
Research indicates that critical listening is an identifiable
factor, separate from general verbal intelligence, vocabulary, and
reading ability (4), and that such an ability is a modifiable skill
(1).


Lundsteen (5) has demonstrated that, given appropriate instruction
and materials, the elementary pupil has considerable capacity for
improved critical listening. Winter (7) and Childers (1) found that
there is significant improvement in listening ability from fourth
through sixth grade, and that listening ability becomes less a
function of measured intelligence through the sixth grade. Fawcett
(3) corroborates these findings. Young (9) found a relationship
between reading comprehension and hearing comprehension and
concluded that much reading disability resulted from poor language
comprehension.


While the previously cited studies offer strong evidence of what
could be done in individual schools to improve class performance
and measured achievement, they cannot be generalized to certain
ethnic/racial groups. Secondly, lower socioeconomic levels were not
adequately represented. The experiment reported herein was
conducted to provide data on culturally disadvantaged black pupils
in a large, urban school system.


METHOD


Sixty-four black pupils, matched on measured intelligence and
reading level, were randomly assigned to an experimental group and
a control group. All students were in seventh grade.


The STEP Listening Comprehension Test, Form 3A, was used to
pretest. A profile chart for each student in the experimental group
was formed showing apparent strengths and weaknesses. The students
were then given special instruction and training in critical
listening with an emphasis on areas needing improvement. The
program centered on content areas designed to develop such skills
as comprehension, interpretation, and critical evaluation. The
types of listening activities included [...] position, direction,
simple explanati[...] and persuasion. The experimental group
received [...] no training as they [...] At the end of 2 weeks of
instruction, both groups were administered the STEP Listening
Comprehension Test, Form [...]

RESULTS 


The mean IQ for the experimental group was 88.84, the control group
91.31 (California Test of Mental Maturity, Form L). The mean
reading score for the experimental group was 54.34, the control
group 55.28 (Iowa Test of Basic Skills, Form 1). The grade level
for the two groups combined was 5.5.


Equivalence between the experimental and control groups was
assured.


TABLE 1


COMPARISON OF EXPERIMENTAL AND CONTROL GROUPS ON LEVELS OF
INTELLIGENCE, READING, AND AUDING SKILL


              Experimental   Control   Difference

IQ               88.84        91.31       2.47
Reading          54.34        55.28        .94
STEP (A)         35.18        38.09       2.91
STEP (B)         43.62        38.18       5.44**
Difference        8.44**        .09


**p < .01, df = 62


The pretest listening comprehension mean score for the experimental
group was 35.18, and for the control group 38.09. This difference
was not significant. The posttest listening comprehension mean
score for the experimental group was 43.62, and for the control
group 38.18. This difference of 5.44 was significant beyond the .01
level. The difference of 8.44 points between pre- and posttest
results for the experimental group was significant beyond the .01
level. There was no significant difference between pre- and
posttest results for the control group.


CONCLUSION 


The results presented support the conclusion that black, elementary
school pupils, designated as economically and educationally
disadvantaged, have the capacity for significant improvement in
critical listening ability (auding). Such pupils as those used as
Ss in this study are characteristically one or more grade levels
behind in reading. Their measured intelligence test scores (IQ's)
are also below the level usually found associated with suburban Ss
and test booklet norms. The extent of reading retardation is
approximately equal to the percentage of retardation in measured
intelligence.


Paper and pencil intelligence test scores (IQ's) are a function of
reading ability (given). Reading ability is a function of critical
listening skills (given). Critical listening skills can be
significantly improved in black elementary pupils (demonstrated).
Therefore... improved reading achievement and/or improved scores on
paper and pencil tests of intelligence in conjunction with a
program of systematic instruction in listening skills development
appears as a logical next step in this line of research.
ANALYSIS OF VARIANCE AND LATIN SQUARE


PROBLEMS BY MULTIPLE REGRESSION ANALYSIS 


LEVERNE S. COLLET and JAMES H. MAXEY 
The University of Michigan 


ABSTRACT 


The purpose of this paper is to provide concrete illustrations of the efficacy of the multiple regression
approach to the analysis of experimental results. A bridge between theoretical and specific applications is
provided by parallel multiple regression and analysis of variance solutions for two typical educational designs.
Detailed illustrations are given in the techniques of writing linear constraints and determining degrees of freedom.
The major advantages of the multiple regression approach are its adaptability to unusual designs and the
facilitation of meaningful interpretation afforded by the provision of regression weights in addition to the usual
F ratios.


BOTTENBERG AND WARD (1) and Cohen (2) describe multiple regression
(MR) as a very powerful and flexible tool which deserves much wider
use among researchers. They point out that MR can be used to
analyze any data for which analysis of variance is appropriate. The
general equivalence of the two methods is illustrated in Figure 1.


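FIGURE 1


PARTITIONING SUMS OF SQUARED DEVIATIONS
ABOUT THE GRAND MEAN


The three terms are, in order: Total; Error (Within Groups);
Treatment (Between Groups).

ANOVA:
\[ \sum\sum (Y_{ij} - \bar{Y}_{..})^2 = \sum\sum (Y_{ij} - \bar{Y}_{.j})^2 + n \sum (\bar{Y}_{.j} - \bar{Y}_{..})^2 \]

MR:
\[ \sum\sum (Y_{ij} - \bar{Y}_{..})^2 = (1 - R^2) \sum\sum (Y_{ij} - \bar{Y}_{..})^2 + R^2 \sum\sum (Y_{ij} - \bar{Y}_{..})^2 \]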


An advantage of MR is that its solution routines are unaffected by
unequal N's or incomplete block designs, both of which are beyond
the capacity of many ANOVA programs. However, a search of the
literature revealed a dearth of detailed MR solutions for practical
research problems. It is the intent of the authors to bridge some
of the gaps between theory and specific applications by providing
comparative ANOVA and MR solutions to two typical problems. A
simple 2-factor design and a 3x3 Latin square are discussed. A wide
variety of designs can be solved by simple extension of the
illustrations given.


BACKGROUND MATERIAL


The key to understanding the MR solution comes in the recognition
that membership in various treatment groups can be represented as
dichotomized variables. Consider just two groups G1 and G2.
Membership or nonmembership in these two groups is illustrated in
Table 1.


Notice that Ss 1 and 2 belong to group 1 and Ss 3 and 4 belong to
group 2. This linear constraint setup for MR is equivalent to
saying that Ss 1 and 2 belong to treatment A and Ss 3 and 4 belong
to treatment B in an ANOVA problem.


TABLE 1

MEMBERSHIP OR NONMEMBERSHIP


Subject    G1    G2

   1        1     0
   2        1     0
   3        0     1
   4        0     1


The formula for the F test using the MR solution is:

\[ F = \frac{(R^2_{Y.AB} - R^2_{Y.A}) / df_1}{(1 - R^2_{Y.AB}) / df_2} \]

This formula tests any increment to R^2_{Y.A} due to the addition
of B. The df_1 for the numerator is the number of linearly
independent vectors in the full model minus the number of linearly
independent vectors in the restricted model. The df_2 for the
denominator is N minus the number of linearly independent vectors
in the full model. Due to the calculation algorithm the identity
vector is always automatically included and should be counted as
one of the vectors for both the full and restricted models. It is
important to remember that the df are exactly the same as in a
traditional analysis using the conventional F tests.
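
The F formula and the degrees-of-freedom rule can be written as a few lines of Python. This is a sketch only: the function name `f_increment` and its argument names are hypothetical, and the sample call borrows the R-squared values (.4060 for the full model, .0495 for the model without factor A) and N = 40 from the two-way example worked later in this paper.

```python
def f_increment(r2_full, r2_restricted, n, vectors_full, vectors_restricted):
    """F ratio for the increment in R^2 when predictors are added.

    vectors_full and vectors_restricted count the linearly independent
    vectors in each model, with the identity vector included in both.
    """
    df1 = vectors_full - vectors_restricted
    df2 = n - vectors_full
    f = ((r2_full - r2_restricted) / df1) / ((1.0 - r2_full) / df2)
    return f, df1, df2

# Full model: identity vector + 3 predictors; restricted model drops one
f, df1, df2 = f_increment(0.4060, 0.0495, n=40,
                          vectors_full=4, vectors_restricted=3)
```

With these inputs the function returns df of 1 and 36 and an F ratio of about 21.6, matching the main effect of A in the example.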


The three main difficulties in using MR solutions are:


(1) Learning to write the proper linear constraints for a given
    problem so that they represent all the different combinations
    of group membership and interactions.

(2) Learning to choose the multiple correlations which test the
    specified hypothesis, and determining the degrees of freedom
    associated with each comparison.

(3) Learning to use a computer program to analyze data. The program
    used in this study was Veldman's (3) Program Regran. The MR or
    General Linear Hypothesis programs from the Biomedical series
    are other possible choices.


Studying the following sample problems, which illustrate these
three difficulties, offers a method whereby they can be recognized
and overcome.


EXAMPLE PROBLEM: 2-WAY ANOVA 


A numerical example of a 2x2 factorial experiment having ten
observations per cell will be used to illustrate the computational
procedures. Suppose that an experimenter is interested in
evaluating how two methods of teaching (factor A) affect changes in
achievement in two categories, boys and girls (factor B). The
dependent variable is an achievement test using gain scores. The
forty Ss have been randomly assigned to one of the four groups.


The analysis was done by using Veldman's (3) AVAR23 program. The
notation used (Table 2), observed scores (Table 3), summary table
of means (Table 4), and two-way Analysis of Variance Source Table
(Table 5) follow.


TABLE 2

NOTATION TABLE


              Boys B1    Girls B2

Method A1       G1          G2
Method A2       G3          G4


TABLE 3

OBSERVED SCORES


G1 (A1,B1)   G2 (A1,B2)   G3 (A2,B1)   G4 (A2,B2)

    23           30           42           31
    31           60           31           13
    42           57           45           18
    23           27           52           22
    54           38           28           23
    72           62           21           41
    81           47           17           37
    93           43           31           18
    72           37           18           17
    67           48           12           16


Based on these results one can conclude that there was a
significant difference between teaching methods at the .0001 level
and that there was no significant difference between sexes and no
significant interaction if a .05 alpha level is used by the
researcher.
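
The conventional two-way analysis can be reproduced directly from the observed scores. The following is a minimal sketch for this balanced 2x2 layout, not a substitute for the AVAR23 program the authors used; the variable names are illustrative.

```python
# Observed scores from Table 3, keyed by (method, sex)
scores = {
    ("A1", "B1"): [23, 31, 42, 23, 54, 72, 81, 93, 72, 67],
    ("A1", "B2"): [30, 60, 57, 27, 38, 62, 47, 43, 37, 48],
    ("A2", "B1"): [42, 31, 45, 52, 28, 21, 17, 31, 18, 12],
    ("A2", "B2"): [31, 13, 18, 22, 23, 41, 37, 18, 17, 16],
}

def mean(values):
    return sum(values) / len(values)

n = 10  # observations per cell
grand = mean([y for cell in scores.values() for y in cell])
cell_mean = {key: mean(obs) for key, obs in scores.items()}
a_mean = {a: mean([cell_mean[(a, b)] for b in ("B1", "B2")]) for a in ("A1", "A2")}
b_mean = {b: mean([cell_mean[(a, b)] for a in ("A1", "A2")]) for b in ("B1", "B2")}

# Sums of squares for a balanced design
ss_a = 2 * n * sum((m - grand) ** 2 for m in a_mean.values())
ss_b = 2 * n * sum((m - grand) ** 2 for m in b_mean.values())
ss_between = n * sum((m - grand) ** 2 for m in cell_mean.values())
ss_ab = ss_between - ss_a - ss_b
ss_within = sum((y - cell_mean[key]) ** 2 for key, obs in scores.items() for y in obs)

ms_within = ss_within / (4 * (n - 1))  # 36 degrees of freedom
f_a = ss_a / ms_within
f_b = ss_b / ms_within
f_ab = ss_ab / ms_within
```

Each effect has one degree of freedom here, so its mean square equals its sum of squares and the F ratio is simply that sum of squares divided by the within-groups mean square.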


TWO-WAY ANALYSIS USING MR 


The same problem will now be analyzed using the MR technique. In
setting up the linear constraints there are several things to
consider: membership or nonmembership in the various factors, and
the criterion. A S is assigned a 1 if he is a member of a certain
level and a -1 if he is not a member of that level. This gives the
set of linear constraints shown in Table 6.


Since in the presence of the identity vector the scores for A2 are
completely determined by A1, and B2 by B1, it is important to
eliminate this redundancy in order to have a non-singular matrix.
Variable X1 represents factor A and variable X2 represents factor
B. Variable X3 represents the AB interaction term and is generated
from the product X1·X2. Variable X4 is the criterion score. If one
desires, 0's may be substituted for the -1's. Table 7 shows how
members of each of the four groups are coded and is the raw data
that goes on the computer cards.


TABLE 4


AB SUMMARY TABLE OF CELL AND MARGINAL MEANS


           b1       b2

a1        55.8     44.9     50.35 = A1
a2        29.7     23.6     26.65 = A2

          42.75    34.25
           B1       B2


TABLE 5


TWO WAY ANALYSIS OF VARIANCE SOURCE TABLE


SOURCE       S.S.        D.F.     M.S.       F-Ratio      P

Between    6397.0023       3    2132.3341
  A        5616.9020       1    5616.9020    21.6058    .0001
  B         722.5003       1     722.5003     2.7791    .1005
  AB         57.6000       1      57.6000      .2216    .6456
Within     9359           36     259.9722

Total     15756.0023      39     404.0001


To compute the F ratio for the main effect of A, variables X1, X2,
X3 were used to predict X4; then the multiple correlation
R^2_{X_4.X_2X_3} was computed. These two values were then
substituted into the formula given earlier. A summary of the
multiple correlations and the F values follows:

R^2_{X_4.X_1X_2X_3} = .4060

R^2_{X_4.X_1X_2} = .4023

R^2_{X_4.X_2X_3} = .0495

R^2_{X_4.X_1X_3} = .3601


Main effect A:
\[ F = \frac{(R^2_{X_4.X_1X_2X_3} - R^2_{X_4.X_2X_3}) / 1}{(1 - R^2_{X_4.X_1X_2X_3}) / 36} = 21.606, \quad p = .0001 \]

Main effect B:
\[ F = \frac{(R^2_{X_4.X_1X_2X_3} - R^2_{X_4.X_1X_3}) / 1}{(1 - R^2_{X_4.X_1X_2X_3}) / 36} = 2.779, \quad p = .1005 \]

Interaction:
\[ F = \frac{(R^2_{X_4.X_1X_2X_3} - R^2_{X_4.X_1X_2}) / 1}{(1 - R^2_{X_4.X_1X_2X_3}) / 36} = .222, \quad p = .6456 \]


(Compare these results with those from the ANOVA in Tables 4 and
5.)
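
The MR solution can also be checked numerically. The sketch below fits the full and restricted models by ordinary least squares in pure Python, using the 1 / -1 coding described in the text. The helpers `solve`, `r_squared`, and `r2_using` are illustrative names written for this example; they are not part of Veldman's Regran program.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def r_squared(rows, y):
    """R^2 of y regressed on the columns of rows (identity vector included)."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    weights = solve(xtx, xty)
    predicted = [sum(w * x for w, x in zip(weights, r)) for r in rows]
    y_bar = sum(y) / len(y)
    ss_residual = sum((yi - p) ** 2 for yi, p in zip(y, predicted))
    ss_total = sum((yi - y_bar) ** 2 for yi in y)
    return 1.0 - ss_residual / ss_total

# (X1, X2) codes per group, followed by the observed scores from Table 3
groups = [
    ((1, 1), [23, 31, 42, 23, 54, 72, 81, 93, 72, 67]),
    ((1, -1), [30, 60, 57, 27, 38, 62, 47, 43, 37, 48]),
    ((-1, 1), [42, 31, 45, 52, 28, 21, 17, 31, 18, 12]),
    ((-1, -1), [31, 13, 18, 22, 23, 41, 37, 18, 17, 16]),
]
design, y = [], []
for (x1, x2), observed in groups:
    for score in observed:
        design.append([1, x1, x2, x1 * x2])  # identity, X1, X2, X3 = X1*X2
        y.append(score)

def r2_using(columns):
    return r_squared([[row[c] for c in columns] for row in design], y)

r2_full = r2_using([0, 1, 2, 3])
r2_without_a = r2_using([0, 2, 3])
r2_without_b = r2_using([0, 1, 3])
r2_without_ab = r2_using([0, 1, 2])

df2 = len(y) - 4  # N minus the vectors in the full model
f_a = (r2_full - r2_without_a) / ((1 - r2_full) / df2)
f_b = (r2_full - r2_without_b) / ((1 - r2_full) / df2)
f_ab = (r2_full - r2_without_ab) / ((1 - r2_full) / df2)
```

Because every restricted model here drops a single vector, df_1 is 1 for each test and is left implicit in the numerators.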


Notice that the F ratios are exactly the same as under the
traditional analysis, and of course the conclusions would be the
same. Also, additional information regarding the regression
equation is available:


Y = 23.7X1 + 8.5X2 + 2.4X3 + 21.2
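
The regression weights can be checked against the cell means. The sketch below assumes one particular reading of the coding, namely that 0's were substituted for the -1's in all three predictors, so that X3 equals 1 where the 1 / -1 product is +1. That reading is an inference from the numbers, not something the text states. The cell means are computed from the observed scores in Table 3.

```python
# Cell means computed from the observed scores
cell_means = {"A1B1": 55.8, "A1B2": 44.9, "A2B1": 29.7, "A2B2": 23.6}

# (X1, X2, X3) with 0 in place of -1; X3 is the recoded +/-1 product (assumed)
codes = {
    "A1B1": (1, 1, 1),
    "A1B2": (1, 0, 0),
    "A2B1": (0, 1, 0),
    "A2B2": (0, 0, 1),
}

def predict(x1, x2, x3):
    """Regression equation as printed in the text."""
    return 23.7 * x1 + 8.5 * x2 + 2.4 * x3 + 21.2

predicted = {cell: predict(*x) for cell, x in codes.items()}
```

Under that coding the equation returns each of the four cell means, which is what a full model with as many weights as cells should do.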
TABLE 6


LINEAR CONSTRAINTS


Group    Cell     A1    A2    B1    B2

  1      A1B1      1    -1     1    -1
  2      A1B2      1    -1    -1     1
  3      A2B1     -1     1     1    -1
  4      A2B2     -1     1    -1     1


EXAMPLE PROBLEM: 3x3 LATIN SQUARE 


Often when a researcher plans to use a Latin Square or some
incomplete block design, he finds that there is no program
available to analyze his particular design. However, there is
always a good MR program available. If the researcher can use MR to
analyze his data, then his choice of designs is not limited to the
available computer programs.


A numerical example of a 3x3 Latin Square experiment having one
observation per cell will be used to illustrate the computational
procedure. A Latin Square of this type can be thought of as a
fractional replication of a 3x3x3 factorial experiment. A basic
assumption of this design is that the interactions are negligible.


TABLE 7


DATA FOR COMPUTER CARDS


Group   Subject    X1    X2    X3    X4

  1        1        1     1     1    23
           .        .     .     .     .
          10        1     1     1    67
  2        1        1    -1    -1    30
           .        .     .     .     .
          10        1    -1    -1    48
  3        1       -1     1    -1    42
           .        .     .     .     .
          10       -1     1    -1    12
  4        1       -1    -1     1    31
           .        .     .     .     .
          10       -1    -1     1    16


Suppose that an experimenter is interested in evaluating the
relative effectiveness of three different schools (factor A) under
three different methods (factor B) on three different ability
groupings of students (factor C). This type of design is usually
used when there are not enough Ss available for a full factorial
design or the number required for a complete factorial is
impractical.


TABLE 8

NOTATION MATRIX


        b1    b2    b3

a1      c3    c2    c1
a2      c1    c3    c2
a3      c2    c1    c3


Based on the results of this experiment (see Tables 9 and 10), one
can conclude that there is no significant difference for any of the
factors.


This same problem can be solved by using the MR technique. Table 11
shows the proper linear constraints, which were based on membership
or nonmembership in each of the three factors.


The logic of the constraint is exactly the same as for a 3x3x3
factorial experiment but with no interaction terms present.


TABLE 9

OBSERVED SCORES


        B1    B2    B3

A1      14    11    22
A2       8    20    13
A3      17     6     7


* The computational procedures used are shown in Winer (4:526).


TABLE 10

3 WAY LATIN SQUARE SOURCE TABLE


Source      SS       df      MS        F

A          49.56      2     24.78     .246
B           4.22      2      2.11     .021
C           5.56      2      2.78     .028
Error     201.55      2    100.78

Total     260.89      8


TABLE 11


LINEAR CONSTRAINTS FOR LATIN SQUARE


Identity                          Ability     Observed
Vector     Schools    Methods     Levels      Scores
   u       X1   X2    X3   X4     X5   X6        Y

   1        1    0     1    0      0    0       14
   1        1    0     0    1      0    1       11
   1        1    0     0    0      1    0       22
   1        0    1     1    0      1    0        8
   1        0    1     0    1      0    0       20
   1        0    1     0    0      0    1       13
   1        0    0     1    0      0    1       17
   1        0    0     0    1      1    0        6
   1        0    0     0    0      0    0        7


The following multiple correlations were needed for the
computations.

R^2_{Y.(1-6)} = .2274

R^2_{Y.(3-6)} = .0375

R^2_{Y.(1-4)} = .2061

R^2_{Y.(1-2)(5-6)} = .2112


Main effect A:
\[ F = \frac{(R^2_{Y.(1-6)} - R^2_{Y.(3-6)}) / 2}{(1 - R^2_{Y.(1-6)}) / 2} = .246 \]

Main effect B:
\[ F = \frac{(R^2_{Y.(1-6)} - R^2_{Y.(1-2)(5-6)}) / 2}{(1 - R^2_{Y.(1-6)}) / 2} = .021 \]

Main effect C:
\[ F = \frac{(R^2_{Y.(1-6)} - R^2_{Y.(1-4)}) / 2}{(1 - R^2_{Y.(1-6)}) / 2} = .028 \]


These are identical to the results obtained from the conventional
procedure. Note that if each cell represents more than one
observation an entry into the data matrix for each S must be made.
The number of linearly independent vectors in the full model,
R^2_{Y.(1-6)}, is seven (u, X1, ..., X6) and in R^2_{Y.(3-6)} is
five (u, X3, ..., X6). Therefore, df_1 for the main effect of A =
7 - 5 = 2. Since df_2 equals N minus the number of linearly
independent vectors in the full model, df_2 = 9 - 7 = 2. The df for
main effects of B and C are similarly computed.
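
The Latin square results can be reproduced from the nine observed scores. The sketch below relies on a property of Latin squares: the three factors are mutually orthogonal, so the R-squared of any model built from whole factors equals the sum of those factors' sums of squares divided by the total sum of squares. The cell layout (which ability level falls in which cell) is read from the dummy codes in the Table 11 constraints and should be treated as an assumption; the variable names are illustrative.

```python
# (school, method, ability level, observed score)
cells = [
    (1, 1, 3, 14), (1, 2, 2, 11), (1, 3, 1, 22),
    (2, 1, 1, 8),  (2, 2, 3, 20), (2, 3, 2, 13),
    (3, 1, 2, 17), (3, 2, 1, 6),  (3, 3, 3, 7),
]

n = len(cells)
grand = sum(c[3] for c in cells) / n
ss_total = sum((c[3] - grand) ** 2 for c in cells)

def ss_factor(position):
    """Sum of squares for the factor stored at `position` in each tuple."""
    total = 0.0
    for level in (1, 2, 3):
        level_scores = [c[3] for c in cells if c[position] == level]
        level_mean = sum(level_scores) / len(level_scores)
        total += len(level_scores) * (level_mean - grand) ** 2
    return total

ss = {"A": ss_factor(0), "B": ss_factor(1), "C": ss_factor(2)}

# Orthogonal factors: R^2 of the full model is the summed SS over the total
r2_full = sum(ss.values()) / ss_total

vectors_full, vectors_restricted = 7, 5   # identity vector counted in both
df1 = vectors_full - vectors_restricted
df2 = n - vectors_full

f = {}
for name in ss:
    r2_restricted = r2_full - ss[name] / ss_total  # model without this factor
    f[name] = ((r2_full - r2_restricted) / df1) / ((1 - r2_full) / df2)
```

With only two error degrees of freedom the F ratios are necessarily unstable, which is consistent with the finding of no significant difference for any factor.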


LINEAR CONSTRAINTS 


Since one of the purposes of this paper is to help the reader with
the difficult task of setting up the proper linear constraints, the
constraints for a 2x3 and a 2x2x2 factorial problem are given in
Tables 12 and 13. It is suggested that the serious reader may want
to try his hand at setting up these constraints and use the
following examples as a self check. Note that the number of
variables needed for any level corresponds with the degrees of
freedom for that level.


TABLE 12


LINEAR CONSTRAINTS FOR A 2x3 FACTORIAL DESIGN*


Cell     X1    X2    X3    X4    X5    X6

A1B1      1     1     0     1     0
A1B2      1     0     1     0     1
A1B3      1     0     0     0     0
A2B1      0     1     0     0     1
A2B2      0     0     1     1     0
A2B3      0     0     0     1     1


*X1 represents factor A. X2 and X3 represent factor B. X4 and X5
represent the interaction term. X6 is left for the criterion score.
Note that X4 is the product of X1·X2 and X5 is the product of
X1·X3.
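
The constraint tables can be generated mechanically. The sketch below assumes a coding rule inferred from the legible entries of Tables 12 and 13 rather than stated in the text: each vector is first coded 1 / -1, interaction vectors are formed as products of those codes, and every -1 is then replaced by 0. The function names are hypothetical.

```python
from itertools import product

def recode(value):
    """Replace -1 by 0, as the text allows for the linear constraints."""
    return 1 if value == 1 else 0

def constraints_2x3():
    """Rows of (X1, X2, X3, X4, X5) for cells A1B1, A1B2, A1B3, A2B1, ..."""
    rows = []
    for a in (1, -1):                      # A1 = +1, A2 = -1
        for level in (1, 2, 3):            # B1, B2, B3
            x2 = 1 if level == 1 else -1   # member of B1?
            x3 = 1 if level == 2 else -1   # member of B2?
            raw = (a, x2, x3, a * x2, a * x3)
            rows.append(tuple(recode(v) for v in raw))
    return rows

def constraints_2x2x2():
    """Rows of (A, B, C, AB, AC, BC, ABC), cells ordered A1B1C1, A1B1C2, ..."""
    rows = []
    for a, b, c in product((1, -1), repeat=3):
        raw = (a, b, c, a * b, a * c, b * c, a * b * c)
        rows.append(tuple(recode(v) for v in raw))
    return rows

table_12 = constraints_2x3()
table_13 = constraints_2x2x2()
```

Note that `constraints_2x2x2` lists the cells in a fixed order (C varying fastest), which may differ from the row order printed in Table 13. Forming the products on the 1 / -1 codes before recoding is what makes a cell such as A2B3 carry 1's in the interaction columns even though its main-effect columns are all 0.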


SUMMARY 


In general, the assumptions underlying the use of MR are identical
to the ones justifying the use of conventional procedures. MR
assumes homogeneity of variance and that the Y distribution is
normal. However, there can be marked deviation from these
assumptions without seriously affecting the results as long as N is
fairly large.


TABLE 13


LINEAR CONSTRAINTS FOR A 2x2x2 FACTORIAL DESIGN


Cell       A    B    C    AB    AC    BC    ABC    Criteria

A1B1C1     1    1    1     1     1     1     1
A1B2C1     1    0    1     0     1     0     0
A1B1C2     1    1    0     1     0     0     0
A1B2C2     1    0    0     0     0     1     1
A2B1C1     0    1    1     0     0     1     0
A2B2C1     0    0    1     1     0     0     1
A2B1C2     0    1    0     0     1     0     1
A2B2C2     0    0    0     1     1     1     0


So far it has been shown that MR is equivalent 


to several conventional techniques. Why Should one 
go to the bother of learning MR? It is the opinion of 


the authors that the MR technique has the following 
advantages: 


1. The regression values are provided along with
the F ratios, which will allow for more meaningful
interpretation.

2. If there are other independent variables of
interest it is easy to check their effect on Y. This
is not necessarily true under conventional
procedures.

3. MR is a very flexible system that can simulate
most models. Often there are only a limited
number of programs available and it is difficult
to find one to do a specific conventional test.
MR is particularly useful for analyzing unusual
patterns such as those associated with many
quasi-experimental designs.

4. If one is hypothesis hunting for further research,
MR is an efficient way to search. The Veldman
(3) program Regran allows the use of one
hundred variables, fifty regression equations,
and an unlimited number of F-tests.

5. Under ANOVA techniques one usually tests
interactions because they are part of the model
and not because of some thought-out rationale.
The use of MR requires more careful formulation
of hypotheses.

6. MR allows the researcher to use both qualitative
and quantitative independent variables.

Most classical solutions are really just special
cases of a general MR analysis. It is the hope of
the authors that this paper will encourage the reader
to experiment with MR analysis and decide if MR has
some exciting possibilities for his particular
research interests.
ABSTRACT


Data from 169 high school students were rotated by means of an oblique factor rotation so that from a 27 x
27 variance-covariance matrix six factors were obtained. These factors were labeled: intellectual introversion,
dependence, superego strength, ego-strength, independent orientation, and verbal aggression. The results
suggested that the factors of intellectual introversion, dependence, superego strength, and ego-strength
had concurrent validity. It was concluded that a factor analysis of the Edwards Personal Preference Schedule
(EPPS), though questionable mathematically due to the fact that the EPPS is an ipsative measure, extracted
psychologically relevant dimensions. The findings also suggested that introverted, socially controlled,
mathematically apt students were not competing successfully with verbal, high ego-strength students in high school
academic work.


EDWARDS (6) presents evidence that the scales
of his test are relatively independent. This finding
was substantiated by Allen (1), who found essentially
the same matrix of nonsignificant relationships
with the exception that the scales Affiliation vs
Nurturance, Order vs Endurance, and Order vs Deference
showed significant positive relationships, while
Deference vs Autonomy and Succorance vs Intraception
showed significant negative relationships. Allen (1)
suggests that these correlations show that items
relevant to each of these eight variables (Order and
Deference appear in two pairs) are, in a direct or
increasing way, related to each other through an
underlying unity along a personality continuum for each
of the five pairs.


A factor analytic approach using R methodology
to reveal the simple structure underlying the fifteen
scales of the EPPS has normally been considered to
be unacceptable by Cattell (3) and Guilford (7).
They point out that the forced choice or paired
comparison method of item choice used in the EPPS
makes it resemble an ipsative rather than a normative
measure. Guilford states that intercorrelations
of ipsative measures over people using R methodology
would, therefore, be improper. He also indicates
that correlating ipsative scores with normative
measures leads to correlations which are difficult
to interpret. Stoltz (13) cautions further that the
forced choice format may also tend to make normally
covarying responses diverge, due to the binary
nature of the choice. Choice scores are therefore
not independent estimates of the implicit probabilities
of a response, but are dependent on each other,
since no choice can be made without it affecting the
possibility of another choice being made.


Stoltz (13) points out another difficulty in factor
analyzing using R methodology. The factor of social
desirability should be controlled, which he suggests
could be done by partial correlation methods. He
also suggests that a routine factor analysis be
attempted in which the K scales from the Minnesota
Multiphasic Personality Inventory (MMPI), which
both Edwards and Stoltz consider to be an adequate
measure of social desirability, are included to
maximize the extraction of variance due to the social
desirability factor, should it occur.


A factor analysis using an oblique rotation after
Digman (4) was performed which included the fifteen
EPPS scales, verbal and quantitative scores on
the SCAT V and Q, seven teachers’ ratings of
students’ performance, and their rank in graduating
class from high school. Teachers’ ratings, SCAT
V and Q, and rank in class (rank) are normative
measures. Teachers’ ratings of accuracy (ACCU)
and effort and industry (E-I) have been shown by
Dixon, Fukuda, and Berens (5) to be highly predictive
of students’ performance in high school and can be
interpreted as measures of conformity to teachers’
expectations in the high school setting. These
ratings should therefore intercorrelate highly with those
measures from the EPPS which maximize the
dimension of social desirability.

The statistical technique of multiple linear
regression, following Bottenberg and Ward (2), was
used to assess the amount of independent contribution
each scale of the EPPS, SCAT V and Q, and
teachers’ ratings had in prediction of rank. A similar
analysis including rank was performed with all
variables testing their individual contribution to
prediction of SCAT V and also SCAT Q. Thus, a check
on the results of the factor analysis by means of
multiple linear regression was made to answer some
of the questions concerning the factor pattern
revealed by the oblique rotation of all variables.

The anal[ysis ...] multiple li[near ...] [pre]dictive
validity of the obli[que ...] criteria rather than
mathematical ones were therefore used to assess the
conformity of the findings to psychological assumptions.

TABLE 1
TEACHERS’ RATING SCALE

Scale: 0 10 20 30 40 50 60 70 80 90 100
(For each trait, the four descriptors run from the low end to the high end of the scale.)

ACCURACY (ACCU)
  Work is poor. Makes frequent errors.
  Work is inaccurate and below standard.
  Work is well done and reasonably accurate.
  Work is of highest quality.

COOPERATION (COOP)
  Disagreeable. Cannot or will not work with others.
  Works with others sometimes but has difficulty.
  Usually agreeable. Generally willing to help.
  Always agreeable. Willing to do extra favors.

EFFORT-INDUSTRY (E-I)
  Does as little as possible. Lazy.
  Seldom completes required work.
  Usually does work that is required. Occasionally does extra work.
  Very industrious. Does extra work gladly.

INITIATIVE-LEADERSHIP (I-L)
  Acts only under direction.
  Seldom originates any work. Follows [...]
  Plans many of his activities and [...] Still needs supervision.
  Marked ability to think for himself.

RELIABILITY-RESPONSIBILITY (R-R)
  Neglects promise and obligations. Unreliable.
  Reliable on some occasions. Often needs supervision.
  Usually dependable. Conscientious.
  Thoroughly dependable.

PROMPTNESS-PUNCTUALITY (P-P)
  Undependable. Almost always late.
  Frequently late.
  Usually on time but occasionally late.
  Always on time.

SELF-CONFIDENCE (S-F)
  Timid. Hesitant. Easily influenced.
  Appears to be over self-conscious.
  Wholesomely self-confident.
  Shows superb self-assurance.


METHOD

Subjects

The students were members of a graduating class
of a large high school in Hawaii. Those selected for
study had complete records on [...] V and Q,
teachers’ ratings, and rank. [...] belonged to five
differ[ent ...] the same teacher. [...] administered
[...] throughout the course of 2 school days. One
student failed to complete the EPPS, which left 169
students for the analysis.

Instrumentation

[...] for the graduating class were examined [...]
scores, rank, and teachers’ ratings were recorded
[...], resulting in five ratings per scale with the
range being [...] (see Table 1). The
rating system thus conformed to the criteria given by
TABLE 3

LOADINGS ON SIX FACTORS OF EPPS SCALES, IQ CLASS, TEACHERS’ RATINGS, AND R

                  Intellectual                Superego   Independent   Ego        Verbal
Variable          Introversion   Dependence   Strength   Orientation   Strength   Aggression

 1 = n ach           0.23           0.07        0.14        0.08         0.56       0.09
 2 = n def           0.03           0.10        0.86       -0.14         0.08      -0.02
 3 = n ord           0.02          -0.03        0.80        0.08        -0.03       0.16
 4 = n exh          -0.22          -0.15       -0.17       -0.11         0.74      -0.24
 5 = n aut          -0.10           0.03       -0.10        0.68         0.17       0.14
 6 = n aff          -0.37*          0.05        0.00       -0.01         0.16      -0.79
 7 = n int           0.31*          0.37*       0.36*       0.12        -0.20      -0.16
 8 = n suc          -0.06           0.82       -0.09        0.02         0.13       0.07
 9 = n dom          -0.08           0.07        0.14        0.06         0.69*      0.02
10 = n aba           0.07           0.22        0.46*       0.02        -0.35*     -0.15
11 = n nur          -0.19           0.33        0.03       -0.09        -0.14      -0.48*
12 = n chg           0.37*         -0.31*       0.29        0.14        -0.07      -0.28
13 = n end           0.05          -0.05        0.86*      -0.09         0.31*     -0.09
14 = n het          -0.40*          0.29       -0.22        0.68*       -0.09       0.03
15 = n agg          -0.02           0.24        0.06        0.08         0.12       0.61
16 = con             0.37*          0.07       -0.09        0.09         0.05      -0.45*
17 = IQ Class       -0.52*         -0.12        0.29       -0.01        -0.14       0.06
18 = SCAT V          0.48*          0.08       -0.36*      -0.06         0.16      -0.10
19 = SCAT Q          0.54*          0.16       -0.21        0.02         0.13      -0.08
20 = ACCU            0.80*          0.06       -0.01        0.01         0.02      -0.03
21 = COOP            0.86*         -0.06        0.56*       0.01        -0.05      -0.02
22 = E-I             0.87*         -0.02        0.02       -0.03        -0.07       0.01
23 = I-L             0.81*         -0.01       -0.04        0.01         0.01       0.01
24 = R-R             0.86*          0.01        0.08       -0.01        -0.02       0.02
25 = P-P             0.88*         -0.02        0.05       -0.06        -0.07       0.04
26 = S-F             0.75*         -0.05        0.01        0.05         0.07      -0.01
27 = R              -0.75*         -0.08        0.15        0.05        -0.02      -0.01


*Meets criterion of ±.30.


Rugg (12), who recommends the use of at least three
judges and approaches the number set by Symonds
(14), who recommends the use of eight judges. The
ratings made by the teachers were not confined to the
intervals of the scale, such as 0, 10, or 20, but were
placed at any point which the teachers considered
appropriate. The teachers placed checks at the point
they considered appropriate for each student. The
TABLE 2


INTERCORRELATIONS BETWEEN EPPS SCALES, SCAT V AND Q, TEACHERS’ RATINGS, AND R 


1 2 3 4 5 6 7 8 9 10 11 12 13 
1=п ach 1.00 -0.17 -0.11 0.32 0.13 -0.29 -0.22 0.05 0.35 -0.37 -0.43 -0.04 -0.05 
2-n def 1.00 0.32 -0.10 -0.24 -0.04 0.10 -0.14 -0.15 0.19 0.06 -0.10 0.21 
3-n ord 1.00 -0.25 -0.18 -0.17 -0.05 -0.17 -0.25 0.10 -0.01 0.35 -0.20 
4-n exh 1.00 0.10 -0.07 -0.17 -0.02 0.32 -0.30 -0.13 0.04 -0.19 
5-n aut 1.00 -0.20 -0.20 -0.05 0.14 -0.25 -0.31 0.07 -0.23 б 
6=n aff 


1.00 0.06 -0.03 -0.25 0.19 0.51 0.05 -0.10 

T=n int 1.00 -0.06 -0.27 0.24 0.24 0.00 -0.08 

8=n suc $ à . 

9-n dom | 


1.00 -0.37 -0.31 -0.12 0.00 
» 
10=n aba 


1.00 0.29 -0.03 0.17 
11=п nur 
а 1.00 0.00 
13=п епа 1.00 
14=n het 

15=n agg 

16=n con 

17-IQ class 

18=SCAT V 

19-SCAT Q 

20-ACCU 

21-COOP 

22-E-I 

23-I-L 

24-R-R 


25-P-P 


A 
14 15 16 17 18 19 20 21 22 23 24 25 26 27

А -0.16 0.15 0.00 -0.39 0.41 0.31 0.33 0. 26 0.28 0.32 0.30 0.31 0.26 -0.37 
i -0.23 -0.07 -0.14 0.31 -0.28 -0.24 -0.16 -0.03 -0.06 -0.12 -0.08 -0.03 -0.14 0.17 
-0.20 -0.08 -0.17 0.31 -0.34 -0.26 -0.20 -0.15 -0.15 -0.23 -0.17 -0.13 -0.17 0.24 
-0. 05 0.13 -0.01 -0.14 0.13 0.11 0.08 0.09 0.04 0.15 0.05 0.07 0.18 -0.10 
0.13 0.17 0.01 -0.10 0.15 0.09 0.07 -0.01 0.03 0.06 0.04 0. 00 0.04 -0.08 
10.06 -0.48 0.18 0.10 -0.03 -0.02 -0. 04 -0.04 -0.05 -0.10 -0.12 -0.09 -0.09 0.08 
-0.16 -0.30 0.05 -0.14 0.11 -0.04 0.15 0.22 0.23 0.19 0.21 0.18 0.10 -0.13 
0. 08 0,08 0.04 -0.16 0.15 0.20 0.10 0.03 0.09 0.09 0.06 0.06 0.00 -0.16 
-0.10 0.25 -0.10 -0.19 0.22 0.15 0.11 0.11 0.10 0.14 0.10 0.07 0.18 -0.17 
20. 16 -0.12 0.14 0.23 -0.21 -0.16 -0.06 -0.04 -0.03 -0. 15 -0. 04 -0.02 -0.14 0.10 
-0.13 -0.24 -.15 0.09 -0.09 -0.08 -0.06 -0.07 -0.06 -0.10 -0.06 -0.02 -0.09 0.08 
-0.01 -0.09 0.14 -0.16 0.11 0.05 0.18 0.23 0.23 0.20 0.24 0.22 0. 15 -0.20 
-0. 30 -0. 02 20. 09 0.14 -0.17 -0.04 -0.07 -0.03 -0.06 -0.10 -0.06 -0.01 -0.12 о. 07 
1.00 0.04 -0.02 -0.01 0.02 -0.01 -0.18 -0.20. -0.19 -0.13 -0.20 -0.21 -0.12 0.11 
1.00 -0.19 -0.02 -0.05 -0.09 -0.06 -0.09 -0.09 -0.04 -0.04 -0.05 0.04 0. 08 
` 1.00 -0.33 0.30 0.39 0.36 0.39 0.34 0.33 0.34 0.35 0.24 -0.35 
1.00 -0.83 -0.62 -0.69 -0.61 -0.60 -0.67 -0.60 -0.56 -0.55 0.73 
1.00 0.65 0.67 0.58 0.56 0.63 0.57 0.54 0.52 -0.73 
1.00 0.67 0.53 0.55 0.58 0.57 0.55 0.41 -0. 68 

1.00 0.86 0.92 0.90 0.90 0.84 0.75 -0.87 

1.00 0.91 0.90 0.94 0.87 0, 80 -0.75 

1.00 0.90 0.93 0.86 0.75 -0.82 

1.00 0.90 0.82 0.84 -0. 80 

1.00 0.91 0.79 -0.80 

1.00 0.70 -0.73 

1.00 -0.65 
TABLE 4
INTERCORRELATIONS AMONG FACTORS


         Intellectual                Superego   Independent   Ego        Verbal
Factor   Introversion   Dependence   Strength   Orientation   Strength   Aggression
1           1.00           0.17       -0.30        0.16         0.33      -0.22
2                          1.00       -0.11        0.25         0.05      -0.31
3                                      1.00       -0.17        -0.39      -0.04
4                                                  1.00         0.36       0.09
5                                                               1.00       0.42
6                                                                          1.00


SCAT was administered in the fall of the students' 
senior year in high school. Rank in class was based 
on the first five semesters of work in high school. 


Procedure 


The marks placed by the teachers on the rating
scale of each student were measured from the zero
point on the scale in centimeters to one place
accuracy. Thus, for the scale of accuracy a student
might have scores of 9.6, 9.6, 9.8, 8.5, and 8.9,
as measured in centimeters. For each student,
eleven different school related scores were obtained:
IQ class, Verbal (SCAT V), Quantitative (SCAT Q),
averages of the teachers’ ratings on the scales of
Accuracy (ACCU), Cooperation (COOP), Effort and
Industry (E-I), Initiative and Leadership (I-L),
Reliability and Responsibility (R-R), Promptness and
Punctuality (P-P), Self-Confidence (S-F), and ranks.
Scores for the fifteen EPPS scales, including the
consistency scale, were obtained for each student.
These were need for achievement (n ach), deference
(n def), order (n ord), autonomy (n aut), affiliation
(n aff), intraception (n int), succorance (n suc),
dominance (n dom), abasement (n aba), nurturance
(n nur), change (n chg), endurance (n end),
heterosexuality (n het), aggression (n agg), and
consistency (con, a measure of reliability of response).
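
The averaging step described above can be sketched as follows, using the five example accuracy ratings quoted in the text. The variable names are illustrative; the source does not specify how the averages were computed beyond calling them averages of the teachers’ ratings, so a simple arithmetic mean is assumed here.

```python
from statistics import mean

# Five teachers' marks for one student on the accuracy scale,
# measured in centimeters from the zero point (values from the text).
ratings_cm = [9.6, 9.6, 9.8, 8.5, 8.9]

# Assumed scoring rule: the student's ACCU score is the plain mean
# of the five ratings.
accu_score = mean(ratings_cm)

print(round(accu_score, 2))
```

For this student the averaged accuracy score comes to about 9.28 cm.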


RESULTS 


The subroutines (PERSUB) of Bottenberg and 
Ward (2) and an IBM 360-50 computer were used 
to determine the extent to which post-high school 
destination and sex variables contributed to differ- 
ences in SCAT, teachers’ ratings, and rank. PER- 
SUB was used to perform the statistical technique of 
multiple linear regression to determine F ratios and 
exact probability values to 4-place accuracy. Mul- 
tiple linear regression using PERSUB presents an 
efficient method for programming computation of the
estimated probability for any specific F value. The
estimate is based on the actual distribution of scores
and may be used with degrees of freedom ranging
from 4 to 1,000 (11). The F ratio is computed
between a full model regression equation containing
all predictor variables under consideration and a 
restricted model. 


In the restricted model in this analysis, the in- 
formation about a particular variable is not included 
in the restricted model's equation, and the predic- 
tive efficiency of this equation is compared with the 
predictive efficiency of the full model's equation, in 
which all the information for each variable is includ- 
ed. For example, in predicting IQ scores the full 
model would include the male-female differences in 
the population, while the restricted model omits the 
information that male-female differences existed. 

If knowledge of male-female differences helped in 
the prediction of IQ scores, there would be a signif- 
icant difference using the F ratio statistic between 
the equation containing male-female differences and 
the restricted model, which does not include this in- 
formation. 
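
The comparison of a full and a restricted model can be sketched with the usual F ratio for nested regression models, F = ((R²_full - R²_restricted) / df1) / ((1 - R²_full) / df2). The function below is an illustrative helper, not part of PERSUB; the input values are those reported in Table 5 for the full model and for the model omitting ACCU, with df = 1/142.

```python
def f_ratio(r2_full, r2_restricted, df1, df2):
    """F ratio comparing a full regression model with a restricted one.

    df1: number of predictors dropped to form the restricted model.
    df2: residual degrees of freedom of the full model.
    """
    gain_per_df = (r2_full - r2_restricted) / df1
    error_per_df = (1.0 - r2_full) / df2
    return gain_per_df / error_per_df


# Values taken from Table 5: full model R^2 and the restricted
# model that omits the teachers' rating of accuracy (ACCU).
f_accu = f_ratio(r2_full=0.8304, r2_restricted=0.8169, df1=1, df2=142)

print(round(f_accu, 2))
```

The result is approximately 11.3, close to the tabled value of 11.32; the small difference is what one would expect from working with R² values rounded to four places.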


The variables were anal 
analysis (8), 
to factor analysis. 


of the scores, i.e., 
predictable 
the n-1 oth 
image sco 
formed ап 
relation matrix ma: 


tinct advantages: (1) 
be estimated, as it 
sense of a squared 
the variance to be 
Thurstone model, 

» 1.е., the matrix is 


pretation would result from its use in place of the
more traditional methods.

In the present study, the variance-covariance
matrix of image scores was reduced by the
principal-axes method to six factors. These were then
transformed (“rotated”) by Digman’s (4) variant of the
Harris-Kaiser (9) method of oblique transformation,
a technique which appears to represent a substantial
advance in the field of factor rotation.
The correlations among the twenty-seven variables
are shown in Table 2; six factors were found
to have an eigenvalue ≥ 1.00. Following the rule of
Kaiser (10), the solution was obtained by rotating
the factors with eigenvalues greater than 1. The
variance-covariance matrix was rotated by the
Varimax procedure with the results shown in Table 3.

No allowance was made for differences between the 
sexes in this analysis due to the small number of 
students in the sample. 


Table 4 shows the intercorrelations (factor
cosines) among the six factors derived from the factor
analysis. Using as criterion a factor correlation of
±.30, Table 4 shows that Factor 1 is negatively
related to Factor 3 and positively related to Factor 5,
Factor 2 is negatively related to Factor 6, Factors 3
and 5 are negatively related, and Factors 4 and 5 and
Factors 5 and 6 are positively related.
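
The ±.30 criterion can be applied mechanically to the entries of Table 4. The sketch below transcribes the upper triangle of that table and lists the factor pairs whose correlation reaches the criterion; the dictionary layout and function name are illustrative choices.

```python
# Upper triangle of Table 4 (intercorrelations among the six factors).
FACTOR_CORR = {
    (1, 2): 0.17, (1, 3): -0.30, (1, 4): 0.16, (1, 5): 0.33, (1, 6): -0.22,
    (2, 3): -0.11, (2, 4): 0.25, (2, 5): 0.05, (2, 6): -0.31,
    (3, 4): -0.17, (3, 5): -0.39, (3, 6): -0.04,
    (4, 5): 0.36, (4, 6): 0.09,
    (5, 6): 0.42,
}


def related_pairs(corr, criterion=0.30):
    """Return the factor pairs whose absolute correlation meets the criterion."""
    return {pair: r for pair, r in corr.items() if abs(r) >= criterion}


related = related_pairs(FACTOR_CORR)

for (i, j), r in sorted(related.items()):
    direction = "positively" if r > 0 else "negatively"
    print(f"Factors {i} and {j} are {direction} related (r = {r:+.2f})")
```

Six pairs meet the criterion: 1 and 3, 2 and 6, and 3 and 5 with negative signs, and 1 and 5, 4 and 5, and 5 and 6 with positive signs.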


The statistical technique of multiple linear
regression was used with a full model of all variables
predicting to the criteria of interest. Each variable
was omitted in turn, one at a time, to assess
the amount of its individual contribution in prediction
of the criterion.


The first criterion used for prediction was rank,
which was considered important because it was a
measure of academic achievement in a high school
setting. SCAT V, COOP, and E-I showed significant
prediction of rank (p<.05). Teachers’ ratings
of ACCU showed significant independent contribution
in prediction of rank (p<.01).


A further test was made with all variables
predicting to the criterion, SCAT V, as shown in Table
6. This was considered important since the SCAT
are widely used and readily available from the
Educational Testing Service Cooperative Test Division,
Princeton, New Jersey. Those variables showing
significant independent contribution to SCAT V were
SCAT Q and rank (p<.05). Those showing significant
prediction at the .01 level were n ach, E-I,
and, obviously, IQ class.


In prediction of SCAT Q the variables n int, E-I,
and SCAT V made significant independent contributions
(p<.05). In addition to these variables,
teachers’ ratings of ACCU and S-F made significant
independent contributions to prediction of SCAT Q
at the .01 level of significance.


DISCUSSION 


The results of the 6-factor rotation to an oblique
solution (see Table 3) show that for the first factor
the following variables meet the criterion of ±.30 (the
criterion ±.30 was chosen in order that each variable
would load on at least one factor): affiliation
(the need to form friendships) is negatively related,
intraception (the need to observe and analyze one's
own and others' feelings and motives) is positively
related, change (the need for novelty in everyday
things) is positively related, heterosexuality (the
need to have social and physical relationships with
members of the opposite sex) is negatively related,
all the teachers’ ratings, SCAT V and Q, R and IQ
class are positively related (the last two variables
have negative signs, since smaller scale values are
higher rankings). The consistency or reliability
measure of the EPPS weighs positively on this factor
also.


Since teachers' ratings of ACCU and R-R can be 
said to be measures of conformity to teachers' ex- 
pectations in the school system, they may also be 
said to be variables that maximize social desirabil- 
ity. Factor 1 is therefore a reliable factor which 
extracts the major part of variance due to the pres- 
ence of social desirability in the EPPS. Consider- 
ing Factor 1 as a scale, it may be said to be mea- 
suring intellectual introversion within a high school 
setting. Social conformity, as a tendency to answer 
items in a socially desirable way, is positively re- 
lated to success in high school according to this 
analysis. 


Factor 2 was labeled the dependency factor, since 
succorance (the need to seek encouragement, help, 
and affection from others) showed the highest load- 
ing, intraception loaded positively as did nurturance 
(the need to help friends and others less fortunate), 
and change met criterion with a negative weighting. 
Intraception is shown to be a complex variable, since 
it shows positive loadings on this factor as well as 
on Factors 1 and 3. The reliability of this factor 
cannot be gauged by the factor loading of the consis- 
tency measure of the EPPS, which is negligible for 
this factor: it must be judged in relation to the other 
factors in regard to its psychological relevance. The 
individual with a high score on a scale made from the 
items meeting criterion on Factor 2 would be a de- 
pendent, kindly, introverted person with a tendency 
to support the status quo. 
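
The selection of variables for Factor 2 follows directly from applying the ±.30 criterion to the Dependence column of Table 3. A minimal sketch, with the EPPS loadings on that column transcribed from the table and an illustrative helper name:

```python
# Loadings of the EPPS scales on Factor 2 (Dependence), from Table 3.
FACTOR_2_LOADINGS = {
    "n ach": 0.07, "n def": 0.10, "n ord": -0.03, "n exh": -0.15,
    "n aut": 0.03, "n aff": 0.05, "n int": 0.37, "n suc": 0.82,
    "n dom": 0.07, "n aba": 0.22, "n nur": 0.33, "n chg": -0.31,
    "n end": -0.05, "n het": 0.29, "n agg": 0.24, "con": 0.07,
}


def salient(loadings, criterion=0.30):
    """Variables meeting the criterion, ordered by absolute loading."""
    hits = [(name, value) for name, value in loadings.items()
            if abs(value) >= criterion]
    return sorted(hits, key=lambda item: abs(item[1]), reverse=True)


factor_2_salient = salient(FACTOR_2_LOADINGS)

for name, value in factor_2_salient:
    print(f"{name:6s} {value:+.2f}")
```

The variables returned are succorance, intraception, nurturance, and change (the last with a negative sign), which are the four named in the description of Factor 2.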


Factor 3 shows the highest loading for deference 
(a need to find out what others think), second highest 
for endurance (need for job completion), order 
(need for neatness and planfulness), abasement
(self-punishment for perceived guilt), intraception, 
and a negative loading for SCAT V. This factor 
could be labeled the strong superego factor. An in- 
dividual with a high score on a scale from the items 
meeting criterion on Factor 3 would be retiring, 
self-effacing, and nonverbal. Table 4, which gives 
the correlations among factors, reveals that Factors 
1 and 3 are negatively related (criterion ±.30).
Thus, the intellectual introvert is typically not the 
individual with powerful superego development. 
Consistency does not reach criterion on Factor 3, 
however, so it must be judged on psychological rath- 
er than statistical grounds. 


Factor 4 has, as its highest loading, autonomy 
(the need for independence and freedom from con- 
straint). The next highest loading is heterosexual- 
ity (the need to have social and physical relations 
with members of the opposite sex) and change. High
scorers from a scale derived from Factor 4 would
have a need to be independent from societal
sanctions. A freedom to express heterosexual interest
would result from their subjective independence from 
societal constraints. 


Factor 5 has for its highest loading the variable
exhibition (the need to be noticed and to have one’s
personal achievements talked about); the next highest
loadings were dominance (the need to be
a leader and to influence others), achievement (the
need to be successful), and endurance, with a negative
loading for the need for abasement. The individual
choosing items which would score positively
on a scale made from this factor would be high in
ego-strength.


TABLE 5


THE INDIVIDUAL CONTRIBUTION OF EPPS
SCALES, TEACHERS’ RATINGS, IQ CLASS, AND
SCAT V AND Q IN PREDICTION TO R


Source             R²        F

Full R             0.8304
Achievement        0.8292    1.02
Deference          0.8285    1.59
Order              0.8304    0.02
Exhibition         0.8303    0.09
Autonomy           0.8302    0.12
Affiliation        0.8299    0.40
Intraception       0.8304    0.01
Succorance         0.8280    1.98
Dominance          0.8284    1.62
Abasement          0.8291    1.09
Nurturance         0.8298    0.50
Change             0.8277    2.23
Endurance          0.8289    1.28
Heterosexuality    0.8297    0.58
Aggression         0.8283    1.71
IQ Class           0.8265    3.40
SCAT V             0.8238    5.49*
SCAT Q             0.8274    2.47
ACCU               0.8169   11.32**
COOP               0.8254    4.16*
E-I                0.8241    5.23*
I-L                0.8301    0.26
R-R                0.8288    1.34
P-P                0.8287    1.42
Self-Con           0.8304    0.01


NOTE: Full model regression equation: unit vector
+ Edwards scales + IQ class + SCAT V and Q
+ teachers’ ratings predicting to R.

*p<.05 level of significance.
**p<.01 level of significance.

df = 1/142


Table 4 indicates that the pairs of Factors 1 and
5, and 4 and 5 are positively related. Thus,
intellectual introversion, independent orientation or
outlook, and ego-strength form a constellation of
personality factors which relate positively to one
another. The sixth factor also relates positively to
Factor 5.


The variables meeting criterion for Factor 6 are,
in order of importance, affiliation, negative loading; 
aggression (the need to attack, to argue, to become 
angry) with a positive loading; nurturance, negative 
loading; and consistency, negative loading. If con- 
sistency is a measure of reliability and not a person- 
ality variable, which might be construed from the 
variable’s loading on Factor 6, then the reliability 
of Factor 6 is probably very poor and its meaning 
doubtful. The factor of social acceptance or con- 
formity ( Factor 2) relates negatively to Factor 6. 
A negative relation between the expression of hos- 
tility and aggressive impulses and social control 
would be expected from a psychological point of 
view. 


Intellectual introversion is related most closely 
to performance in high school, but measures of ego- 
strength also relate to academic performance. In- 
dividuals rating high on ego-strength would also tend 
to be verbally aggressive, autonomous in their out- 
look upon societal constraints and would tend not to 
have a strongly intra-punitive superego. The intel- 
lectual introvert would not tend to have a need to ac- 
cept blame and guilt and be deferent. Thus, the 
self-analytical, high ego-strength individual, rather 
than the dependent and self-abasing person, would 
be best suited to achieve in the high school milieu. 
The successful high school student would probably be 
well-structured and planful and would do neat,
accurate, and careful work.


The predictive validity of each variable was
assessed using multiple linear regression to find which
variables showed significant independent contribution
to prediction of R. SCAT V, COOP, and E-I showed
significant independent contributions to prediction of
R (p<.05), and teachers’ ratings of ACCU showed
[...] of a personality dimension [...] predictive
validity [...] environment.

The predictive validity of all variables was
assessed in relation to SCAT V, as is shown in Table
6. SCAT Q and R showed significant independent
contributions to SCAT V (p<.05). Achievement, [...]
and E-I also showed prediction of SCAT V [...]
than the .01 level of significance. Those
scoring well on SCAT V would thus be in a more
selective, heterogeneously grouped class, show a high
need for achievement, be rated as industrious in
school, rank high in their graduating class from high
school, and score relatively well on SCAT Q. Need
for achievement is related to Factor 5 (ego-strength),
which is also related to the verbal abilities
factor, as is indicated above. The EPPS scale
of need for achievement and probably Factor 5 of this
factor analysis might therefore be said to have some
substantiation of their importance in a high school
setting from the test of predictive validity.
Ego-strength and verbal ability are probably important
factors in prediction of success in high school.
A further prediction was made to SCAT Q using
all other variables. Intraception, SCAT V, and E-I
showed significant prediction of SCAT Q (p<.05).
ACCU showed significant independent contribution
to prediction of SCAT Q also (p<.01). (It should
be noted that R did not make significant independent
contribution to SCAT Q.) A student scoring well on
SCAT Q would then be accurate, neat, industrious,
and introverted. Thus, the high SCAT Q student
would differ on salient personality dimensions from
the high SCAT V student, according to this analysis,
even though SCAT V and Q show significant prediction
of each other. The relationship between need
for intraception and Factor 1 is shown to be in relation
to the quantitative dimension of intellectual functioning
which is part of Factor 1. Intraception, the
self and other analytical qualities revealed by this
scale, in addition to its presence in Factor 1, is
factorially complex, since it is also part of the social
conformity and superego-strength factors. The
significant prediction of SCAT Q by intraception may
thus be said to partially validate the loading of
intraception on Factor 1. By contrast, need for
achievement is related to SCAT V and Factor 5
(ego-strength), and SCAT V is predictive of R as well as
being part of Factor 1. The ambitious, verbal
student might well be said to be at an advantage in
a high school milieu, while the controlled, quiet,
introverted, mathematically apt student is not. The
academic potentials of the latter student may well
go unrecognized and unrewarded in an academic milieu
of this nature. Counseling and selection of students
from high school for college work should take
into account that an important segment of the
academically gifted students in a high school population,
to some degree, go unrecognized as compared
with their highly verbal, achievement-oriented
peers.


SUMMARY AND CONCLUSION 


The EPPS was administered to 169 students in
five homogeneously grouped classes varying from
high to low ability. Measures from a teachers’ rating
schedule of classroom performance, SCAT V
and Q, and rank in graduating class were obtained
from the school records.

An oblique rotation was performed on all twenty-seven
variables using R methodology to obtain the
simple structure. This revealed six factors which
were labeled: (1) intellectual introversion, (2)
dependence, (3) superego strength, (4) independent
orientation, (5) ego-strength, and (6) verbal
aggression. It was found that intellectual introversion
was negatively related to superego strength and
positively to ego strength; dependence was negatively
related to verbal aggression; superego strength and
ego-strength were negatively related; while the pairs
TABLE 6


THE INDIVIDUAL CONTRIBUTION OF EPPS
SCALES, TEACHERS’ RATINGS, IQ CLASS, R,
AND SCAT Q IN PREDICTION TO SCAT V


Source             R²        F

Full V             0.7675
Achievement        0.7568    6.56**
Deference          0.7667    0.46
Order              0.7674    0.04
Exhibition         0.7659    0.99
Autonomy           0.7625    3.04
Affiliation        0.7654    1.29
Intraception       0.7628    2.89
Succorance         0.7671    0.25
Dominance          0.7629    2.81
Abasement          0.7671    0.26
Nurturance         0.7666    0.56
Change             0.7673    0.10
Endurance          0.7673    0.10
Heterosexuality    0.7635    2.43
Aggression         0.7684    0.00
IQ Class           0.6908   46.84**
SCAT Q             0.7594    4.94*
ACCU               0.7656    1.17
COOP               0.7642    2.04
E-I                0.7569    6.45**
I-L                0.7674    0.07
R-R                0.7673    0.09
P-P                0.7675    0.01
Self-Con           0.7673    0.10
Rank               0.7585    5.50*


NOTE: Full model regression equation: unit vector
+ 15 Edwards scales + IQ class + SCAT Q +
teachers’ ratings + R predicting to SCAT V.

*p<.05 level of significance.
**p<.01 level of significance.
TABLE 7

THE INDIVIDUAL CONTRIBUTION OF EPPS SCALES, TEACHERS' RATINGS,
IQ CLASS, R, AND SCAT V IN PREDICTION TO SCAT Q

Source             R²         F
Full Q             0.6223
Achievement        0.6180      1.63
Deference          0.6202      0.79
Order              0.6205      0.66
Exhibition         0.6210      0.50
Autonomy           0.6213      0.37
Affiliation        0.6223      0.00
Intraception       0.6069      5.78*
Succorance         0.6204      0.73
Dominance          0.6221      0.09
Abasement          0.6207      0.61
Nurturance         0.6202      0.78
Change             0.6138      3.18
Endurance          0.6205      0.66
Heterosexuality    0.6223      0.00
Aggression         0.6144      2.98
IQ Class           0.6209      0.51
SCAT V             0.6091      4.96*
ACCU               0.6025      7.45**
COOP               0.6221      0.08
E-I                0.6121      3.85*
I-L                0.6198      0.93
R-R                0.6193      1.12
P-P                0.6200      0.87
Self-Con           0.6030      7.25**
Rank               0.6157      2.41

NOTE: Full model regression equation: unit vector + 15 Edwards
scales + IQ class + SCAT V + teachers' ratings + R predicting
to SCAT Q.
*p < .05 level of significance.
**p < .01 level of significance.
df = 1/142
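
The F ratios in Tables 6 and 7 appear to follow the usual test for the loss in R² when one source is dropped from the full model. The sketch below is only an illustration under that assumption; the function name and its default degrees of freedom (taken from the df = 1/142 note) are ours, not the authors'.

```python
def f_for_dropped_source(r2_full, r2_reduced, df_num=1, df_den=142):
    """F ratio for the drop in R^2 when a source is removed from the full model.

    Assumes the standard form F = ((R2_full - R2_reduced) / df_num)
                                  / ((1 - R2_full) / df_den).
    """
    return ((r2_full - r2_reduced) / df_num) / ((1.0 - r2_full) / df_den)


if __name__ == "__main__":
    # Table 6, IQ Class row: full R^2 = 0.7675, reduced R^2 = 0.6908
    print(round(f_for_dropped_source(0.7675, 0.6908), 2))
    # Table 7, Intraception row: full R^2 = 0.6223, reduced R^2 = 0.6069
    print(round(f_for_dropped_source(0.6223, 0.6069), 2))
```

For the IQ Class row of Table 6 this reproduces the printed value of 46.84; other rows agree only to within the rounding of the four-place R² values.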


of the factors independent orientation and ego- 
strength, and ego-strength and verbal aggression, 
were found to be positively related. 


A factor analysis and a prediction study using the EPPS,
teachers' ratings of classroom behavior and a standardized
test of verbal and quantitative skills showed that highly
verbal students with greater ego strength (need for
achievement) tend to dominate in high school grades. The
mathematically apt student who has greater superego
strength (accuracy and effort and industry) and is perhaps
more dependent, does not. Counseling of students in which
quantitative skills are considered an important part of
intellectual functioning is suggested if an important
segment of the intellectually gifted student body is not to
be neglected. Factor analysis of the EPPS, though
questionable from a strictly mathematical viewpoint, does
reveal a simple structure with a strong suggestion of
psychological relevance in a high school setting.


FOOTNOTES 


1. This research was supported by NSF funds administered
   by the research council of the University of Hawaii. We
   are indebted to John M. Digman and Elsie H. Ahern for
   valuable assistance with this paper.


2. The factors obtained in this study resemble the
   Milton-Lipetz solution though item pairs were not
   eliminated in this study. They found five factors which
   were a need for interpersonal relationship and
   affiliation, need for hostile dependency, need for
   status dominance, need for structure and orderliness,
   and need for freedom or independence. Milton, G. A.;
   Lipetz, M. E., "The Factor Structure of Needs as
   Measured by the EPPS," Multivariate Behavioral Research,
   1:37-46, 1968.
Book Reviews
(continued from page 21)


his own concepts, to be intellectually self-changing. But Borton's paradigm includes more. While classical
education confined discovery and self-change to the cognitive domain, Borton extends it to the affective domain:
the student will learn the skills and processes through which he can change his relationship to himself and to
others. It is the inclusion of intra- and interpersonal skills and processes which distinguishes Borton's approach
from the classical view.


In the first part of the book, Borton describes his earliest attempts to involve students in their education.
Like many contemporary educational innovators, Borton's teaching experiences were intimately tied to the
urban-student struggle with issues of race and identity. His efforts were directed toward legitimizing white and black
students' exploration of feelings about what it means to be white or to be black; Borton believed that for students
to get involved in school, they must begin with their own concerns. While Borton describes success in enabling
students to explore their concerns, he also describes failure in moving students from these concerns to the
attainment of broader educational goals. His students, although involved in personally relevant issues, were not
learning the skills of self-change.


To facilitate movement toward this broader goal, Borton presents a simple information processing model
indicating which processes or skills should be taught. The model consists of three stages: sensing, transforming,
and acting. According to this model, the student needs to learn the skills of experiencing his world (sensing),
of analyzing the meaning of his experience (transforming), and of responding in new ways based on his
analysis (acting).


This three-stage model raises some interesting theoretical questions: How do children "sense" the world
at different ages? What cognitive and affective processes are necessary for understanding one's own experience?
What skills are necessary for the behavioral application of cognitive understanding?


In regard to how these skills are taught, Borton uses three terms corresponding to each information
processing stage: "What," "So What," and "Then What." The "Whats" are experiences of the student; the "So
Whats" are the analyses and cognifications of the experiences; the "Then Whats" are the action implications
following from the analyses. These actions, when carried out, provide new data for the information processing
cycle. Thus learning proceeds from experience to cognification to application to new experience. It is,
fundamentally, an inductive approach to learning.


Some important pedagogical questions emerge: How does one construct meaningful experiences for children?
How can one proceed sensibly from an experience to the cognification of that experience? How can the child's
applications of his experience be monitored to ensure continued growth?


If the book suffers it is in its application of the model. For the model to come alive, teachers need
methods for assessing student concerns, and they need to know the kinds of experiences which they can provide to
connect with those concerns. Borton provides a wealth of ideas for locating specific student concerns; these
ideas range from the use of role playing to the use of poetry. Similarly, he provides activities that can be used
to teach processes while connecting with the assessed student concerns; these include fantasy writing and
simulation games. Unfortunately, his ideas appear to be more a function of his own creative spark than to be
methods which other teachers can systematically utilize. It may be premature to expect a new paradigm to present
a complete methodology for assessment and instruction; nonetheless, the problems which remain might have
been more clearly articulated.
(continued on page 51)
CHILDREN'S LITERARY SKILLS


HOWARD GARDNER and JUDITH GARDNER¹
Harvard University


ABSTRACT


Twelve Ss at each of four age levels performed a story completion and retelling task designed to measure
children's skills of understanding, retelling, and creating literature, and to test various theoretical
formulations about aesthetic development. Sensitivity to literary style was also examined. Though Ss 11 or 12
years of age generally evinced the most literary skill, a few children at each age level were outstanding.
Characteristic performances at each age level, individual differences, trends in the development of literary skill,


IN DESCRIBING children's competence in the literary realm,
several competences seem worthy of examination: (1) capacity
to follow the manifest plot of the story; (2) sensitivity to
subtler nuances (tone, style, underlying forces and
tensions); (3) ability to tell and retell stories to others;
(4) skill in creating novel plots, developing themes, and/or
choosing words aptly. Though a number of [...] with the
subject matter of the stories or the relationship between
the child's personality and his literary output. As a
result, little is known about the extent and range of
children's skills in creating, communicating, and
comprehending stories. To probe these skills, the authors
devised a story-completion and retelling task which could be
administered to diverse age groups. Differences across and
within age groups were of interest as well as indices
suggesting a general trend of literary development. The
relationship between literary skills and other kinds of
cognitive capacities was of particular interest and the
following possibilities were considered: literary skill
improves gradually with age; adolescents are significantly
more skilled than preadolescents at handling literary tasks,
just as they are more skilled at other aesthetic assignments
(2, 7); the capacity to perform logical [...] literary
competence (5); a major spurt in literary development occurs
at about age 7, as in other cognitive realms (4, 10).


METHOD


Subjects


Twelve first graders (modal age 6), twelve third graders
(modal age 8), twelve sixth graders (modal ages 11-12), and
twelve ninth graders (modal ages 14-15) were selected at
random from two schools having a predominantly middle-class
population.² Both sexes were equally represented in the
sample.


Materials


Two plots, one original, the other adapted from an old poem,
were each recast by the first author into two markedly
divergent styles. Style A was reminiscent of fairy tales,
with long complex sentences, remote setting, antiquated
dialogue, a neutral pose toward reader and subject matter.
(Once upon a time many years ago, in a little town far
across the sea, there lived a man who...) Style B, more
contemporary and colloquial, featured short [...] sentences,
slang, numerous exclamations [...] by the narrator. (Here's
a story about a [...] man. A guy with more bags of gold than
[...] dreamed of.) Each story totaled about 350 words in
Style A, 300 words in Style B; both featured good and evil
characters and a central crisis.³


Procedure


So that each S could have the opportunity [...] in both
styles, a 2-phase [...] experimenters (E₁ and E₂) situated
in two rooms (A and B) was employed.


In phase I, S [...] Room A, where E₁ read to [...] a "story
without an ending." S was instructed to attend carefully to
the story and then "to make up an ending which you like and
which sounds right for the story." During this phase, S
heard either plot 1 or plot 2 in Style A or Style B. After S
created his ending, he was told to go to Room B and to
retell the story he had heard and the ending he had
constructed to E₂, who "does not know the story." In order
to create a set for exact repetition, S was reminded of the
initial sentences of the original just before proceeding to
E₂.


In phase II, the crucial parts of the initial phase were
repeated in Room B. After he had related the original story
and ending, S was told the second plot in the second style
and asked to make up an appropriate ending. Thereafter he
was instructed to return to Room A and to retell the second
story and its ending to E₁. Upon completion of the story
retelling to E₁, S was asked a few general questions.


This rather cumbersome procedure insured that each S heard
both of the plots and both of the styles and could reveal
his assimilation of them in a natural situation. The entire
session was tape recorded and later transcribed. Order of
story, style presentation, and experimenter were
counterbalanced.


Measures 


A set of new measures was devised for this exploratory
study. Capacity for recall was determined by noting how many
of the six major facts and the twenty-six details in each
story were included in the retelling. Children received a
point for each fact and 1-2 points for each detail recalled.
Understanding was judged by the contents of the recall (did
S get the point of the story?) and the degree to which the
endings took into account the important conflicting forces
in the stories. Creative ability was assessed by an
originality score based on uniqueness and aptness of the
endings; protocols were coded independently by the authors
and given an originality score from 0 to 6. Interjudge
reliability was .97.


Cutting across these skills, and of particular interest, was
the child's sensitivity to literary style (3). The six major
ways in which the two styles differed were specified and Ss
were independently scored on the extent to which their
endings and their retellings possessed these distinctive
characteristics. Protocols could receive scores from 0 to
24; Ss scoring at least 5 were considered possibly sensitive
to style, those scoring at least 10 definitely were
sensitive to style. A high score was possible only if an S's
protocols differed significantly in style. Reliability was
.94 for endings, .90 for retellings.


RESULTS 


Three trends recur throughout the data analysis: the lower
scores of the first graders; the clustering of the means of
the three older groups, with the sixth graders generally
scoring higher; the wide range of scores on most measures at
each level. The slightly greater skills of the sixth graders
and the wide range of scores are indicated by the summary
statistics in Table 1.


Because of the clustering tendencies, two analyses of
variance were performed for each of the principal measures:
a one-way analysis on all four age groups, and a one-way
analysis on the three older groups. Results are presented in
Table 2 and drawn on in the discussion of specific findings.
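
As an illustration of the kind of one-way analysis described here, the following sketch computes an F ratio from group means, standard deviations, and group sizes. It is not the authors' computation: the function name is hypothetical, and it assumes the standard deviations are sample values computed with n - 1 in the denominator, which the paper does not state.

```python
def one_way_f(means, sds, ns):
    """One-way ANOVA F ratio from per-group summary statistics.

    Assumes each sd is a sample standard deviation (n - 1 denominator).
    Returns (F, df_between, df_within).
    """
    k = len(means)
    n_total = sum(ns)
    grand_mean = sum(n * m for n, m in zip(ns, means)) / n_total
    ss_between = sum(n * (m - grand_mean) ** 2 for n, m in zip(ns, means))
    ss_within = sum((n - 1) * s ** 2 for n, s in zip(ns, sds))
    df_between = k - 1
    df_within = n_total - k
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return f_ratio, df_between, df_within


if __name__ == "__main__":
    # Four groups of twelve give df = 3, 44, as in the four-age-level analysis.
    print(one_way_f([1.0, 2.0, 3.0, 2.5], [1.0, 1.0, 1.0, 1.0], [12] * 4))
```

Because the published summary values are rounded and the standard-deviation convention is unknown, applying this to Table 1 should not be expected to reproduce the F scores of Table 2 exactly.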


Discussion of Specific Factors 


Length and Nature of Endings. In this readily measured
category, the findings of the overall study are nicely
epitomized. First graders differed significantly from the
older groups, generally adding only a few words to the
stories; older children usually added at least one hundred
words to the stories, covering a number of events and often
attempting to balance the forces of good and evil presented
at the beginning of the story. Only eight of the first
graders' endings included more than one event, while at
least nineteen of the endings of each of the older grade
levels contained a number of events.


Recall: Nearly every S recalled most of the major facts in
the story, a clear indication that they were attending to
the task and possessed some understanding of the story. The
three older groups were almost flawless. On the other hand,
the first graders included almost no details in their
retelling, while some of the older Ss mentioned nearly every
detail.


Originality: Few of the first graders avoided the most banal
endings. The third graders often presented endings which
were lengthy and unique but not particularly appropriate.
The more talented sixth graders presented the most
interesting and original endings, but the overall
performance of their class did not differ from the ninth
graders' performance. The latter group provided endings of
limited originality but sensed what was appropriate and
responded well to the formal demands of the stories. Their
endings resembled one another, suggesting that they knew how
such stories should end.


Sensitivity to Style: Performances were generally
disappointing. Few Ss of any age rendered endings which
reflected the distinctive styles of the stories. According
to the guidelines described above, one S at each of the
three higher grade levels was judged definitely sensitive to
style in his endings, while two third graders, two sixth
graders, and three ninth graders were classed as possibly
sensitive.


Style sensitivity was much greater in the retellings, but
despite the efforts of the coders to discount specific word
choice, this measure was probably confounded to some extent
with memory. Differences among Ss were vast at each age
level, this measure being the only one in which the first
graders did not perform significantly worse than the older
groups. The ability of the first graders to remember certain
striking phrases and to recall the first line of the story
probably contributed to the appearance of style sensitivity.
Two first graders, four third graders, six sixth graders,
and three ninth graders were judged sensitive to style in
their retellings; two first graders, four third graders,
three sixth graders, and three ninth graders were considered
possibly sensitive.
TABLE 1

MEANS, STANDARD DEVIATIONS AND RANGES ON PRINCIPAL MEASURES

                                    Grade level
Statistic            first        third        sixth        ninth

LENGTH IN LINES
  Mean                1.33        11.54        16.25        11.37
  S.D.                 .59        14.12        13.66         7.62
  Range               1-3         1-54       1.5-47.5      6-31.5

MAJOR FACTS (maximum score 12)
  Mean                9.21        11.12        11.67        11.62
  S.D.                1.79          .97          .26          .63
  Range             6-11.5        9-12        11-12        10-12

DETAILS (maximum score 78)
  Mean                 *          23.29        30.88        24.96
  S.D.                             9.88        10.54        12.72
  Range                          7.5-44      17.5-54      9.5-48.5

ORIGINALITY (maximum score 6)
  Mean                 .73         2.81         3.31         3.14
  S.D.                 .56         1.45         1.42          .84
  Range               0-2       .25-5.25      1-5.25       2-4.25

STYLE SENSITIVITY IN ENDING (maximum score 24)
  Mean                 .79         3.25         3.75         4.46
  S.D.                1.05         3.01         2.80         4.15
  Range               0-4        .5-10.5       .5-10       .5-15.5

STYLE SENSITIVITY IN RETELLING (maximum score 24)
  Mean                5.04         8.33         9.33         7.04
  S.D.                5.08         4.10         5.66         6.79
  Range             .5-16.5        2-19      1.5-19.5       0-19

* Measure inapplicable for reasons cited in footnote 3.
TABLE 2

ANALYSES OF VARIANCE OF SCORES ON PRINCIPAL MEASURES

                                   F scores on various measures
Age levels             Length   Major     Details   Originality   Style     Style
included                        facts                             ending    retelling

4 age levels            3.90*   13.20**     ---       12.40**      4.03*     1.28
  (df = 3, 44)
3 higher age levels      .57     2.39      1.44         .45         .36       .45
  (df = 2, 33)

*p = .025
**p = .01
Note: No other differences significant.


Moral Stance of the Ending: The youngest Ss made no attempt
to balance the scales of justice. They allowed the powerful
forces to triumph completely or rewarded good forces without
regard to the fate of evil. With increasing age, Ss were
more likely to focus on the villain, having him repent,
yield his power, or be punished. Both sixth graders and
ninth graders included many reversals of fortune in the
narrative, but the ninth graders more often presented the
kind of moralistic ending traditional for fairy tales.


Aspects of Understanding: Though only direct questions can
unambiguously establish understanding, the content of the
endings and the retellings definitely indicate that every S
had some understanding of the stories. That the
comprehension of younger Ss was incomplete is suggested by
the briefness of their endings and their failure to take
into account the forces and power conflicts in the stories.
While the first graders did not integrate suggestive
elements of the plot in their endings, over one half of the
stories of the third graders, and over three-fourths of the
stories of the sixth and ninth graders included some
integration of pregnant facts or details. Posttest
interviews revealed that only the ninth graders were
explicitly aware of the style differences in the two plots;
while this awareness may signal a greater appreciation of
the formal properties of literature, it does not imply
superior ability in recreating that style.


DISCUSSION 


Exploratory studies are designed as much to generate as to
test hypotheses. Generalizations are accordingly risky, yet
in a field as uncharted as literary skill, some attempt to
organize and interpret findings seems justified.


The typical performance at each age level can be
characterized. First graders treat the stories like a series
of strips in a comic, calling for one additional line to
complete the message. These closing lines are less
inappropriate than incomplete, as they do not take adequate
account of the various forces extant in the stories. The
general lack of originality contrasts with young children's
frequent imaginativeness in spontaneous storytelling.
Probably the structured nature of the task restricts the
youngster's creative powers. Understanding is limited, as
the young Ss lack sufficient personal experience, cognitive
complexity, and familiarity with literary convention.


An apt description of the third grader's stories 
is picaresque. Subjects interpret the task as an 
occasion to list a long series of events involving a 
hero. Often these episodes are borrowed from 
other stories and may be inappropriate; yet the 
most gifted third graders relate endings which are 
both inventive and relevant. 


Sixth grade might be viewed as the watershed of literary
development. Subjects understand the stories, select
appropriate endings, and are in control of syntax and ideas.
Superior performances in most skills are found at this
level. If endings are cartoon-like, they are considered; if
picaresque, the child maintains control of the story's
drift. An increasingly cognitive orientation is evident, as
characters "think," "doubt," and "bargain"; good and evil
are balanced. A majority of the children combine the daring
inventiveness of the younger child with the control and
direction of the older child. Even the less talented seem to
retain promise.


Self-consciousness and self-criticism often combine to
hinder the literary productivity of the oldest group. Most
are already "professional" in their approach, competent in
executing the task, but less imaginative, less alert to
details, and less able to preserve stylistic nuances than
the sixth graders. The oldest group excels in psychological
insight and in the ability to discuss literary properties;
but Ss tend to rework the material into their own or their
peers' way of speaking. The advent of formal operations,
which enables the child to fit different contents into the
same operational structure, paradoxically seems to diminish
the child's ability to remain within the style, rhythm, or
tone created by an author. Thus, among the various
theoretical trajectories considered, the evidence favors two
positions: the spurt in capacity following the ages 5-7 and
the diminishing of sensitivity to language following the
onset of puberty, perhaps due to the advent of formal
operations (5, 6). No evidence in favor of gradual
improvement or adolescent superiority was found.


Though these age differences seem genuine, and consistent
with other developmental findings, the ranges within age
groups are even more striking. At every level there are
children who perform like the average first-grader, one or
two who perform at the level of the most talented child.
Furthermore, the various skills cluster; children with good
memories are also the most original and the most sensitive
to style. Though it is possible that the exercise has merely
tapped general intelligence or task ability, it seems more
probable that a small percentage of the population is
especially gifted in the verbal-literary area and that this
group is already identifiable at an early age.


These two conclusions about literary skill may appear
inconsistent. On the one hand, each age group can be
separately characterized; on the other, a small group can be
isolated as especially talented from the first. Further
paradoxes are also raised by the findings. For example, the
ninth graders seem less skilled than the sixth graders on a
range of tasks, yet clearly the most developed skills will
belong to much older individuals. Perhaps some progress
toward resolving such puzzles can be made if one assumes a
universal sequence to literary development which will be
interrupted if the stages do not proceed at a sufficiently
rapid rate. According to this view, all individuals would
pass from one-line episodes to picaresque creations to some
capacity at appreciating styles, handling thematic
materials, and producing imaginative works. Yet if the child
has reached adolescence by the time he passes through these
stages of literary development, his heightened critical
faculty and self-consciousness might make him resist further
explorations and fall back on formulas and "safe"
approaches. Only the child who, because of superior verbal
skills, intelligence, or tutelage, had passed through the
sequence more rapidly would be able to achieve sufficient
skill and mastery of the medium during the preadolescent
years; then when he became increasingly critical, he would
be less likely to find his works unacceptable and could
continue his literary development. Studies of [...] progress
through the [...] influences is a crucial question for which
longitudinal studies would [...] results suggest that
elementary school teachers should offer their students
maximum opportunity to engage in literary creativity, since
the height of their potential may well have been reached and
passed before formal instruction has ever begun.


FOOTNOTES 


1. Authors' address: Department of Social Relations,
   Harvard University, Cambridge, Massachusetts 02138.


2. We would like to thank Mr. R. Brown of the Newton Public
   School System for help in arranging the study and the
   staffs of Day Jr. High School and the Underwood School
   for assistance in carrying it out. The study was
   supported by Project Zero and a grant from the
   Department of Social Relations. We are also indebted to
   Professors Roger Brown and Marshall Haith for their
   incisive comments on an earlier draft.


3. The version heard by the first graders was somewhat
   shortened and simplified. This procedure has no apparent
   effect on any of the measures except the one probing
   memory for details, which has accordingly been
   eliminated.
ABSTRACT


The Wherry-Doolittle procedure has been used for over 30 years to reduce the number of variables in a
multiple correlation. This paper describes techniques for obtaining the same kind of reduction of number of
variables in the cases of canonical correlation, discriminant analysis, and multivariate analysis of variance.
Statistical tests comparable to those used in the Wherry-Doolittle procedure are cited.


SUPPOSE WE are given two sets of variables X and Y with the
objective of predicting X from Y. The prediction is best
accomplished by means of the regression equations attendant
upon the canonical correlations (9). However, it is
sometimes desired to reduce the number of variables in the Y
set without disturbing the predictability greatly; one
method of doing that reduction is the topic of this paper.

Similar problems exist in reducing the number of variables
required to effect discrimination among groups or levels of
treatment in a multivariate analysis of variance. A study of
the discriminant problem was undertaken by Weiner and Dunn
(10) who used four techniques of battery reduction including
stepwise regression (but not the Wherry-Doolittle procedure)
and evaluated the efficacy of the techniques through
probability of misclassification.


THE WHERRY-DOOLITTLE PROCEDURE IN MULTIPLE CORRELATION


The calculation procedure for a Wherry-Doolittle battery
reduction is described in Garrett's text (3). A short foray
into algebra suffices to show that the technique is very
simple in its logic, though complex in its calculation. The
logic proceeds thusly:

    1. Calculate all the correlations between the
       predictors and the criterion.

    2. Choose the predictor with the largest criterion
       correlation as the most important of the predictors
       (say, predictor number 1).

    3. Calculate the partial correlations between the
       remaining predictors and the criterion, removing the
       chosen predictor(s).

    4. Choose as the next most important predictor that one
       of the remaining predictors which has the largest
       partial correlation with the criterion.

    5. Compute the multiple correlation between the chosen
       predictors and the criterion. Determine whether the
       latest addition to the chosen predictors adds
       substantially to the previously obtained multiple
       correlation. If a substantial increase has been
       made, repeat steps 3, 4, and 5.

    6. When addition of a new predictor fails to make a
       substantial increase in the multiple correlation,
       the procedure is terminated.

A statistical test is available for determining the
importance of adding a given predictor to the set. As stated
by Rao (7:225), this is the test of the partial correlation
between the last chosen predictor and the criterion removing
the effect of the previously chosen predictors.
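
The six steps above can be sketched in a few lines of Python. This is a minimal illustration of the logic only, not the Wherry-Doolittle computing routine: the function names are ours, the criterion is assumed to sit in row and column 0 of the correlation matrix, and the fixed `min_gain` threshold is a stand-in for "a substantial increase", which the text leaves to a significance test.

```python
import numpy as np


def partial_corr(R, a, b, controls):
    """Partial correlation of variables a and b, removing `controls`."""
    if not controls:
        return float(R[a, b])
    idx = [a, b] + list(controls)
    P = np.linalg.inv(R[np.ix_(idx, idx)])  # precision matrix of the subset
    return float(-P[0, 1] / np.sqrt(P[0, 0] * P[1, 1]))


def forward_select(R, min_gain=0.01):
    """Stepwise choice of predictors from correlation matrix R.

    Index 0 is the criterion; indices 1..n are predictors (an assumption
    of this sketch). Returns the chosen indices and R^2 after each step.
    """
    chosen, r2_path = [], []
    remaining = list(range(1, R.shape[0]))
    r2_prev = 0.0
    while remaining:
        # Steps 1-4: largest (partial) correlation with the criterion.
        best = max(remaining,
                   key=lambda j: abs(partial_corr(R, 0, j, chosen)))
        # Step 5: squared multiple correlation with the candidate added.
        idx = chosen + [best]
        r_cp = R[0, idx]
        r2 = float(r_cp @ np.linalg.solve(R[np.ix_(idx, idx)], r_cp))
        # Step 6: stop when the gain is no longer substantial.
        if r2 - r2_prev < min_gain:
            break
        chosen.append(best)
        remaining.remove(best)
        r2_path.append(r2)
        r2_prev = r2
    return chosen, r2_path
```

A usage example: with a criterion correlating .6, .5, and .1 with three mutually uncorrelated predictors, `forward_select(R, min_gain=0.05)` keeps the first two predictors and stops before the third.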


This procedure has been criticized by many users 
because the final selection of predictors is subject to 
considerable sampling variation. This shortcoming 
of the procedure can only be overcome by alternative 
information or shrewd guesswork by the researcher. 


Several articles have been written on the vagaries of
battery reduction procedures in multiple correlation.
Burkett (1), Herzberg (5), and Rock and others (8) are
recent examples. Herzberg discusses battery reduction in
canonical correlations but manages to resolve the multiple
criteria to a single criterion before invoking battery
reduction processes.


ALTERNATIVE CRITERIA FOR CHOICE OF VARI- 
ABLE 


It is worthwhile to note alternatives to the decision rules:
to wit, Step 4 of the procedure where one chooses the next
variable to be added to the predictor set. The procedure
calls for the inclusion of the variable with the largest
partial correlation with the criterion. If we denote this
partial correlation as $r_{ic}$, we may also choose that
variable which makes the greatest reduction in the
predictable variance, that reduction being
$R^2 - r_{ic}^2$. Or, we may also choose that variable which
has the largest F ratio in the significance tests for the
correlation between variables and criterion, $r_{ic}$, since
$F = r_{ic}^2 / (1 - r_{ic}^2) \times \text{constant}$.


An alternative calculation routine is also avail- 
able. Wherry's calculation procedure in step 4 is 
based on the Gauss-Doolittle method of matrix in- 
version which is complex and somewhat difficult. 


ification of the Square root method of factor analysis 
(4:102). 


Suppose the matrix of correlations between the criterion (c)
and the predictors (1, 2, ..., n) are arranged in a matrix
as in Table 1.


TABLE 1

CORRELATIONS BETWEEN PREDICTORS AND CRITERION

[Table body not recoverable.]


Also, suppose that predictor 1 has the largest correlation
with the criterion. The effect of predictor 1 can be removed
from R by the following stratagem. Denote $C_2$ as column 2
of R containing the correlations of predictor 1 with all
other variables. Form the matrix $C_2 C_2'$ and subtract it
from R: $R - C_2 C_2' = R^*$. This matrix has 0's in column
2 and row 2. The other elements are: off diagonal,
$r_{ij} - r_{1i} r_{1j}$ is the ij-th entry, and on diagonal
the i-th entry is $1 - r_{1i}^2$. Dividing each entry by the
square roots of its row and column diagonal gives its value
as

$$ r_{ij \cdot 1} = \frac{r_{ij} - r_{1i}\, r_{1j}}{\sqrt{1 - r_{1i}^2}\,\sqrt{1 - r_{1j}^2}} $$

which is the partial correlation between variables i and j
adjusted for predictor variable 1.
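
The subtraction and rescaling just described can be written out directly. The sketch below is our own illustration using NumPy; the function name and the convention of passing the index `k` of the variable to be removed are assumptions, and it presumes no remaining variable correlates perfectly with variable `k` (which would give a zero diagonal).

```python
import numpy as np


def partial_out(R, k):
    """Remove variable k from correlation matrix R.

    Forms R* = R - c c', where c is column k of R, then rescales the
    remaining rows and columns by the square roots of their diagonals.
    Returns the partial correlation matrix and the retained indices.
    """
    c = R[:, k].copy()
    R_star = R - np.outer(c, c)          # zeros in row k and column k
    keep = [i for i in range(R.shape[0]) if i != k]
    sub = R_star[np.ix_(keep, keep)]     # diagonal entries are 1 - r_ki^2
    d = np.sqrt(np.diag(sub))
    return sub / np.outer(d, d), keep
```

Applying the function repeatedly, once per chosen predictor, yields the higher-order partial correlations needed at later steps.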


It might also be noted that this calculation proce- 
dure does not require that the original matrix of 
predictor-criterion correlation, R, be of full rank 
as does the Wherry-Doolittle procedure. 


GENERALIZATION TO CANONICAL CORRELA- 
TION 


Suppose we are given two sets of variables X and 
Ү with intercorrelations, 


хх ху ә 
yx Ry (c) 


where the within battery correlation matrices, 
and Ryy; are of full rank. The canonical correla- 
tions, р>, between X and Y are determined from 
either of the determinantal equations 
-1 P a 
LN -aR = 0 (a) 
or 


-1 
Ry ey тақ | = 0 (b) 
There are a few aspects of these equations that are 
seldom discussed but warrant review here. Using 
equation a we note that 


(1) the trace, $\mathrm{tr}(R_{yy})$, is the "total" variance
of the variable set Y,


(2) the diagonal entries of $R_{yx} R_{xx}^{-1} R_{xy}$ are
the (squared) multiple correlations of each of the
Y variables on the variable set X,


(3) $\mathrm{tr}(R_{yx} R_{xx}^{-1} R_{xy})$ is the variance due to
hypothesis (or "between") and is the
variance of the variable set Y that can
be explained by regression from the variable
set X,


be determined from the diagonal of
$R_{yx} R_{xx}^{-1} R_{xy}$ as mentioned above. This choice is
consistent with the Wherry-Doolittle procedure since
(a) it chooses the variable which makes the great- 
ion in the predictable variance, 
$\mathrm{tr}(R_{yx} R_{xx}^{-1} R_{xy})$, and (b) it chooses that variable
which has the largest F ratio in the tests for the
(multiple) correlations, $R_{y_i \cdot X}$, between the $y_i$ and
X because $F = [R_{y_i \cdot X}^2 / (1 - R_{y_i \cdot X}^2)] \times$ constant.
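
A small numerical sketch may help fix these quantities. It is illustrative only: the data are simulated, and the sizes `N`, `p`, `q` and the seed are arbitrary assumptions. It solves equation a as an ordinary eigenvalue problem and reads the squared multiple correlations of each Y variable on the X set from the diagonal of $R_{yx} R_{xx}^{-1} R_{xy}$.

```python
import numpy as np

rng = np.random.default_rng(0)             # simulated data, for illustration only
N, p, q = 200, 3, 4                        # cases, X variables, Y variables
X = rng.normal(size=(N, p))
Y = 0.5 * X @ rng.normal(size=(p, q)) + rng.normal(size=(N, q))

# Partition the joint correlation matrix as in equation c.
R = np.corrcoef(np.hstack([X, Y]), rowvar=False)
Rxx, Rxy = R[:p, :p], R[:p, p:]
Ryx, Ryy = R[p:, :p], R[p:, p:]

# M = R_yx R_xx^{-1} R_xy; the roots of |M - lambda R_yy| = 0 are the
# eigenvalues of R_yy^{-1} M, i.e. the squared canonical correlations.
M = Ryx @ np.linalg.solve(Rxx, Rxy)
lam = np.sort(np.linalg.eigvals(np.linalg.solve(Ryy, M)).real)[::-1]
canon = np.sqrt(np.clip(lam[:min(p, q)], 0.0, 1.0))

smc = np.diag(M)                           # squared multiple correlation of each y on X
print("canonical correlations:", canon)
print("tr(R_yy) =", np.trace(Ryy), " tr(M) =", np.trace(M))
print("squared multiple correlations:", smc)
```

The largest entry of `smc` identifies the Y variable that would be chosen first under the rule described above.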




With this in mind we can formulate rules for bat- 
tery reduction in canonical correlation. 


1. Calculate all the multiple correlations be- 
tween the predictors, Y, and the criteri- 
on, X. 


2. Choose that predictor with the largest 
multiple correlation with the criterion, 
X, as the most important predictor, say, 
Y; 


3. Calculate the partialled multiple correlations
between the remaining predictors and
the criteria. This can be done by reducing
the correlation matrix R (see equation
c) as in the suggested alternative calculation
procedure, obtaining the matrix of partialled
correlations,

\[ R^* = \begin{bmatrix} R^*_{xx} & R^*_{xy} \\ R^*_{yx} & R^*_{yy} \end{bmatrix} \]

then forming $R^*_{yx} (R^*_{xx})^{-1} R^*_{xy}$, which
has the desired partialled multiple correlations
on its diagonal.


4. Choose as the next most important predic- 
tor that one of the remaining predictors 
which has the largest partialled multiple 
correlation with the criteria. 


5. Compute the canonical correlations be- 
tween the chosen predictors and the cri- 
teria. Determine whether the latest ad- 
dition to the chosen predictors adds 
substantially to the previously obtained 
canonical correlations. If a substantial 
increase has been made, repeat steps 3, 
4, and 5. 


6. When addition of a new predictor fails to 
make a substantial increase in the canon- 
ical correlations, the procedure is termi- 
nated. 


A statistical test for the decision to terminate is 
available and is cited by Rao (7:467) in the form 
used in multivariate analysis of variance. A more 
complete discussion of this test follows under the 
section on multivariate analysis of variance. 


The problem of reducing the X battery is identical
to that of reducing the Y battery.
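
The six rules above can be sketched in code. This is a hedged illustration rather than the author's subroutine: the function names, the simulated data, and in particular the stopping rule (a hypothetical threshold `min_gain` on the increase in the largest canonical correlation) are assumptions standing in for the "substantial increase" judgment and for the statistical test cited from Rao.

```python
import numpy as np

def partial_corr(R, given):
    """Correlations among the remaining variables with `given` partialled out."""
    rest = [i for i in range(R.shape[0]) if i not in given]
    if not given:
        return R.copy(), rest
    A = R[np.ix_(rest, rest)]
    B = R[np.ix_(rest, given)]
    C = R[np.ix_(given, given)]
    S = A - B @ np.linalg.solve(C, B.T)    # residual covariance matrix
    d = np.sqrt(np.diag(S))
    return S / np.outer(d, d), rest

def canonical_corrs(R, x_idx, y_idx):
    """Canonical correlations between the X set and a subset of the Y set."""
    Rxx = R[np.ix_(x_idx, x_idx)]
    Rxy = R[np.ix_(x_idx, y_idx)]
    Ryy = R[np.ix_(y_idx, y_idx)]
    M = Rxy.T @ np.linalg.solve(Rxx, Rxy)
    lam = np.linalg.eigvals(np.linalg.solve(Ryy, M)).real
    return np.sqrt(np.clip(np.sort(lam)[::-1], 0.0, 1.0))

def reduce_battery(R, x_idx, y_idx, min_gain=0.02):
    """Forward selection of predictors Y against criteria X (rules 1 to 6)."""
    chosen, remaining, best = [], list(y_idx), 0.0
    while remaining:
        P, rest = partial_corr(R, chosen)              # rule 3
        pos = {v: k for k, v in enumerate(rest)}
        xs = [pos[i] for i in x_idx]
        Pxx = P[np.ix_(xs, xs)]
        scores = {}
        for j in remaining:                            # partialled multiple correlations
            pxy = P[xs, pos[j]]
            scores[j] = float(pxy @ np.linalg.solve(Pxx, pxy))
        cand = max(scores, key=scores.get)             # rules 2 and 4
        top = canonical_corrs(R, x_idx, chosen + [cand])[0]   # rule 5
        if chosen and top - best < min_gain:           # rule 6 (assumed threshold)
            break
        chosen.append(cand)
        remaining.remove(cand)
        best = top
    return chosen

# Simulated example: two criteria, four predictors, two of them pure noise.
rng = np.random.default_rng(1)
N = 300
X = rng.normal(size=(N, 2))
Y = np.column_stack([X[:, 0] + 0.5 * rng.normal(size=N),
                     X[:, 1] + rng.normal(size=N),
                     rng.normal(size=N),
                     rng.normal(size=N)])
R = np.corrcoef(np.hstack([X, Y]), rowvar=False)
x_idx, y_idx = [0, 1], [2, 3, 4, 5]
chosen = reduce_battery(R, x_idx, y_idx)
print("predictors chosen, in order:", chosen)
```

Swapping the roles of `x_idx` and `y_idx` gives the corresponding reduction of the X battery.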


GENERALIZATION TO MANOVA AND DISCRIM- 
INANT ANALYSIS 


In multivariate analysis of variance and discriminant
analysis (a one-way MANOVA), the problem
of battery reduction is almost identical to the problem
in canonical correlation. The logic is identical
up to the slight differences in the mechanics of obtaining
the solution. Once the relationship between
the mechanics of the two solutions is seen, the similarity
of the battery reduction procedure is apparent.


In a MANOVA or discriminant analysis the solution
originates from the determinantal equation

\[ |SS_H - \lambda SS_E| = 0 \tag{d} \]

where $SS_H$ and $SS_E$ represent the sums of squares
for hypothesis and error respectively.


For convenience sake, let us denote the variables 
which are measured as the Y variables. 


To obtain the sums of squares for hypothesis it
is often convenient to set up p dummy variables
(dummy parameters, design parameters, or fixed
variables) when there are p + 1 levels of the experimental
design (or p + 1 groups in the discriminant
analysis). Let us denote these variables as the X
set. Next the sums of squares and cross products
of the X variables and the Y variables are obtained
as the (partitioned) matrix

\[ \begin{bmatrix} X'X & X'Y \\ Y'X & Y'Y \end{bmatrix} \]

The term $SS_H$ is then obtained by calculating

\[ SS_H = Y'X (X'X)^{-1} X'Y \tag{e} \]

the $SS_E$ is chosen (usually as some residual) and
the determinantal equation d is solved.


Equation d can be manipulated as follows. Substituting
e into d we obtain

\[ |Y'X (X'X)^{-1} X'Y - \lambda SS_E| = 0 \]

Substituting $\mu / (1 - \mu)$ for $\lambda$ we can obtain

\[ |Y'X (X'X)^{-1} X'Y - \mu (SS_E + Y'X (X'X)^{-1} X'Y)| = 0 \tag{f} \]

The term $SS_E + Y'X (X'X)^{-1} X'Y$ is recognized
as the total sum-of-squares matrix, $SS_T$. (Note
that in a discriminant analysis this is Y'Y of the
S.P. matrix.) Rewrite f as

\[ |Y'X (X'X)^{-1} X'Y - \mu SS_T| = 0 \]

Designating $\sigma_T$ as the diagonal matrix of the square
roots of the diagonal elements of $SS_T$ we can pre-
and post-multiply by $\sigma_T^{-1}$ and obtain

\[ |\sigma_T^{-1} Y'X (X'X)^{-1} X'Y \sigma_T^{-1} - \mu \sigma_T^{-1} SS_T \sigma_T^{-1}| = 0 \]

Now $\sigma_T^{-1} SS_T \sigma_T^{-1}$ is a correlation matrix, say $R_{yy}$,
in the "total" variance of the variable Y.
If we also designate $\sigma_X$ as the diagonal matrix of
the square roots of the diagonal elements of X'X we
may rewrite the equation again as

\[ |\sigma_T^{-1} Y'X \sigma_X^{-1} (\sigma_X^{-1} X'X \sigma_X^{-1})^{-1} \sigma_X^{-1} X'Y \sigma_T^{-1} - \mu \sigma_T^{-1} SS_T \sigma_T^{-1}| = 0 \]

It is immediately apparent that $\sigma_X^{-1} X'X \sigma_X^{-1}$ is the
correlation matrix for the X variables, say $R_{xx}$,
and that $\sigma_T^{-1} Y'X \sigma_X^{-1}$ is a cross correlation matrix, say
$R_{yx}$. The equation can be now written as

\[ |R_{yx} R_{xx}^{-1} R_{xy} - \mu R_{yy}| = 0 \]

This result shows the relationship to canonical correlations.


What remains to be shown is that the values on
the diagonal of $R_{yx} R_{xx}^{-1} R_{xy}$ are "multiple correlations"
which have F ratios which are univariate F
ratios of each of the Y variables in the analysis.1

Consider a single variable y of the set Y. The
hypothesis sum of squares for y is $y'X (X'X)^{-1} X'y = SS_H^y$.
If $SS_E^y$ is the error sum of squares for y, the
total sum of squares for y is $SS_H^y + SS_E^y = \sigma_y^2$. Now
the diagonal entry of $\sigma_T^{-1} Y'X (X'X)^{-1} X'Y \sigma_T^{-1}$ is

\[ d = \frac{1}{\sigma_y} \, y'X (X'X)^{-1} X'y \, \frac{1}{\sigma_y} \]

which reduces to

\[ d = \frac{SS_H^y}{\sigma_y^2} = \frac{SS_H^y}{SS_H^y + SS_E^y} \]

If d is the multiple correlation attached to the univariate
F ratio for y then $\frac{d}{1 - d} \cdot \frac{df(E)}{df(H)}$ is
that F ratio.2 Now

\[ \frac{d}{1 - d} \cdot \frac{df(E)}{df(H)} = \frac{SS_H^y / (SS_H^y + SS_E^y)}{SS_E^y / (SS_H^y + SS_E^y)} \cdot \frac{df(E)}{df(H)} = \frac{SS_H^y}{SS_E^y} \cdot \frac{df(E)}{df(H)} \]

which is the univariate F ratio for variable y.


By this means we see that the set of rules for
battery reduction in canonical correlation applies
directly to discriminant analysis and MANOVA. The
calculations for MANOVA battery reduction are most
easily carried out when the data manipulation is carried
out as if the problem were a canonical correlation,
but the result is the desired one.
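
The equivalence can be checked numerically. The sketch below is an illustration under stated assumptions: the group sizes, effect sizes, and seed are invented, the dummy variables and the Y variables are taken as deviation scores, and df(H) = p and df(E) = N - p - 1 are the usual one-way values.

```python
import numpy as np

rng = np.random.default_rng(2)                        # simulated data, illustration only
groups = np.repeat([0, 1, 2], [12, 15, 10])           # p + 1 = 3 groups
N, p = groups.size, 2
Y = rng.normal(size=(N, 3)) + groups[:, None] * np.array([0.8, 0.0, 0.3])
X = np.column_stack([(groups == g).astype(float) for g in (1, 2)])  # p dummy variables

Xc, Yc = X - X.mean(0), Y - Y.mean(0)                 # deviation scores
SSH = Yc.T @ Xc @ np.linalg.solve(Xc.T @ Xc, Xc.T @ Yc)   # equation e
SST = Yc.T @ Yc
SSE = SST - SSH

# Diagonal "multiple correlations" d and the F ratios they imply.
d = np.diag(SSH) / np.diag(SST)
F_from_d = d / (1 - d) * (N - p - 1) / p

def anova_F(y):
    """Textbook one-way analysis of variance F ratio for a single variable."""
    grand = y.mean()
    between = sum((groups == g).sum() * (y[groups == g].mean() - grand) ** 2
                  for g in range(3))
    within = sum(((y[groups == g] - y[groups == g].mean()) ** 2).sum()
                 for g in range(3))
    return (between / p) / (within / (N - p - 1))

F_direct = np.array([anova_F(Y[:, j]) for j in range(3)])

# The same diagonal obtained from the correlation form R_yx R_xx^{-1} R_xy.
R = np.corrcoef(np.hstack([X, Y]), rowvar=False)
M = R[p:, :p] @ np.linalg.solve(R[:p, :p], R[:p, p:])

# Roots of equation d (lambda) and of the correlation form (mu).
lam = np.sort(np.linalg.eigvals(np.linalg.solve(SSE, SSH)).real)
mu = np.sort(np.linalg.eigvals(np.linalg.solve(SST, SSH)).real)

print(F_from_d, F_direct)
print(lam, mu / (1 - mu))
```

The two printed pairs agree to rounding error, which is the content of the argument above: the MANOVA roots and the canonical roots are tied by the substitution of $\mu / (1 - \mu)$ for $\lambda$, and the diagonal entries carry the univariate F ratios.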


A STATISTICAL TEST FOR MANOVA AND CA- 
NONICAL CORRELATION BATTERY REDUCTION 


Step 5 of the battery reduction procedure calls 
for a determination to be made as to whether a new 
predictor adds substantially to prediction obtained 
by previous selections. A statistical test for addi- 
tional information from additional predictors has 
been developed by Rao (7:467) to be used in discrim- 
inant analysis. As has been observed in the previous 
section, canonical correlation and MANOVA are sim- 
ilar procedures and the statistical tests applicable to 
one apply to the other. 


Without pursuing Rao’s logic here it should be 
noted that the test reduces to an analysis of the 
relationship between the X variables and the chosen 
Y variable when the previously chosen Y variables 
are treated as if they were covariates. That is, in 
canonical correlation problems, calculate the sig- 
nificance test for the partialled multiple correlation 
between the criteria X and the chosen predictor with 
the previously chosen predictors as covariates; and 
in MANOVA, calculate the significance test for the
chosen Y variable treating the previously chosen Y
variables as covariates.
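
As a rough sketch of what such a covariate-adjusted test looks like, the following code compares a reduced regression (previously chosen Y variables only) with a full one (those variables plus the X set). It assumes the ordinary linear-model F test with df(H) = p and df(E) = N - 1 - q - p; these degrees of freedom follow the standard analysis-of-covariance convention and are not quoted from Rao. The function name and the simulated data are hypothetical.

```python
import numpy as np

def added_predictor_F(X, Y_prev, y_new):
    """F test for the relation of y_new to X with Y_prev held as covariates."""
    N, p = X.shape
    q = Y_prev.shape[1]
    ones = np.ones((N, 1))

    def rss(D):
        # Residual sum of squares of y_new regressed on the columns of D.
        beta, *_ = np.linalg.lstsq(D, y_new, rcond=None)
        r = y_new - D @ beta
        return float(r @ r)

    reduced = rss(np.hstack([ones, Y_prev]))          # covariates only
    full = rss(np.hstack([ones, Y_prev, X]))          # covariates plus X set
    df_h, df_e = p, N - 1 - q - p
    F = ((reduced - full) / df_h) / (full / df_e)
    partial_R2 = (reduced - full) / reduced           # partialled squared multiple correlation
    return F, (df_h, df_e), partial_R2

# Simulated example: two X variables, one previously chosen Y variable.
rng = np.random.default_rng(3)
N = 60
X = rng.normal(size=(N, 2))
Y_prev = rng.normal(size=(N, 1))
y_new = X[:, 0] + 0.5 * Y_prev[:, 0] + rng.normal(size=N)
F, (df_h, df_e), pr2 = added_predictor_F(X, Y_prev, y_new)
print(F, df_h, df_e, pr2)
```

In the MANOVA case the X set would be the dummy variables for group membership, so the same computation is an analysis of covariance on the newly chosen Y variable.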


SHORTCOMINGS AND PRECAUTIONS


The Wherry-Doolittle procedure has been belabored
often because the choice of reduced battery depends
upon the vagaries of the sample chosen: the
repetition of the process on a new sample may yield
a different reduced battery. This fault of the procedure
is certain to be true when it is applied to canonical
correlation and MANOVA. In the application
to MANOVA, the difficulty may not prove to be as
severe as in multiple correlation because the variables
one is trying to predict are group membership
variables (or design parameters) which are usually
determinable without "error of measurement" and
a MANOVA involves only sampling variation in the
Y or predictor variables.


In canonical correlation it would appear that the
problem is even more unstable than in multiple correlation.
In the case of multiple correlations, not
only are the predictors subject to sampling variation,
but the one criterion variable is also. In canonical
correlation, the criteria are numerous and
the sampling variations among them are compounded.
For this reason, it may be that battery reduction
in canonical correlation is almost a useless procedure
because of sampling variation. Meredith (6)
has devised a procedure for correcting canonical correlation
data for error of measurement in the variables
and has found that the correction procedure
greatly altered the results of analysis. This is a
strong suggestion that battery reduction in canonical
correlation may be an almost useless procedure.


COMMENTS ON EFROYMSON'S STEPWISE RE- 
DUCTION AND THE STEP-UP PROCEDURE 


Efroymson’s Stepwise procedure (2) differed 
from the Wherry-Doolittle procedure only by per- 
mitting variables to be dropped. This requires that 
a single step be inserted in the regimen after step 
5. Before recycling steps 3, 4, and 5 to consider 
addition of another variable, compute the partialled 
correlations for each of the chosen variables holding 
all other chosen variables as covariates. The 
smallest of these partialled correlations is tested 
for its contribution to the criterion by the test cited 
by Rao. If a decision is made to drop one variable,
it is eliminated and the new (reduced by one) set of
chosen variables is reexamined for deletion of another
variable. If no variable can be dropped, the
process recycles through steps 3, 4, and 5.




The step-up procedure is the process where the
entire set of predictors is examined to determine
whether one or more variables can be eliminated
for lack of contribution to prediction. This is the
same process as for deletion of chosen variables in
the Efroymson procedure if one starts with the entire
set of variables as "chosen" variables. In
short, consider the partialled multiple correlation
between each predictor and the criteria with all other
predictors as covariates. The smallest of these
partialled multiple correlations is tested as cited by
Rao and a decision made about dropping the variable.
If a decision is made to drop the variable, the reduced
set of variables is reexamined for deletion of
another variable.


A FORTRAN IV subroutine is available from the
author to perform the techniques discussed in the paper.


FOOTNOTES


1. This is a necessary argument because the in-
Book Reviews


(continued from page 41) 


New paradigms in education, as in other fields, create resistance. In later chapters, Borton considers
some of the particular resistances to process education: parents may fear that their children will not learn academic
skills; teachers and students may have long developed expectations about educational goals and methods which
run counter to those necessary for involvement in a process orientation. A more subtle resistance, and one
never made explicit by Borton, results from the fact that an affective education of the sort described by Borton
undoubtedly conjures up images of psychotherapy. Borton alludes to this potential resistance when he attempts
to reassure the reader that he intends process education to be education and not therapy.


The image of therapy has a number of elements. To some it connotes pathology and mysterious therapeutic
processes. Borton is presumably attempting to remove these connotations and thereby overcome resistance.
It may be, however, that public resistance stems from other elements of therapy: attention to feelings and the
goal of personal change. In a culture unaccustomed to expressing feelings or communicating about them,
Borton's views will not be accepted easily by educators or parents. Furthermore, the goal of personal change
requires skills of teachers that they have not been previously taught. Opposition to the therapy-like aspects of
process education, then, may be founded on elements other than those which can be removed through simple
reassurance. The problem of overcoming resistance, if this analysis is correct, is one of legitimizing the
exploration of feelings and of helping educators to increase their own interpersonal skills in promoting change.


Education and therapy have always shared some common elements. The attention to feelings found in newer
educational proposals and the increasing use of the definition of learning as behavioral change are adding to
the similarity. Given this similarity, it might be appropriate to meet the issue head on and to explore ways in
which teachers and students can be helped to live more effective, affective lives. Borton urges that schools
interested in process education provide their staffs with sensitivity training. Such training allows teachers to
experience, as "students," interpersonal and affective learning, and it may partially prepare teachers to
understand and promote the learning of their own students. Beyond sensitivity training, there is a need for teachers
to monitor their own and their students' feelings about the educational program. The systematic collection and
examination of data may be the best way to ensure that teachers and students learn.


Borton's book concludes with an appendix containing reference to articles, books, and films that serve to
continue the reader's own educational process. Borton's book, like his model of learning, is inductive. He
begins with his own experiences, cognifies them, and applies them to an educational program. He leaves the
reader with suggested materials providing for further experience and learning.


Steven R. Asher, Reviewer
Instructional Research Laboratory 
University of Wisconsin-Madison 
ORGANIZATIONAL CLIMATE AND FREQUENCY
OF PRINCIPAL-TEACHER COMMUNICATIONS
IN SELECTED OHIO ELEMENTARY SCHOOLS¹


ABSTRACT


CENTRAL TO a changing system of interaction
is the process of communication. Communication
aids or hinders goal achievement within the organization
and it affects group membership (3:534).
Because the frequency of principal-teacher communications
in the public elementary school might have
been a determinant in the school's organizational
climate as well as in its teacher esprit (morale), a
hypothesis was tested, namely: that the total frequency
of oral and written communications between
the principal and his faculty collectively, as well as
downward from the principal to the faculty and upward
from the faculty to the principal, were significantly
(p < .05) related to the nature of the school's
organizational climate as well as the faculty's esprit.


METHODOLOGY 


The nature of a school's organizational climate
and the degree of its faculty's esprit can be determined
through the work of Halpin. Describing the
school's organizational climate as the organizational
personality of the school, Halpin through factor
analysis derived six profiles or prototypic organizational
climates for the elementary school. These
profiles, moreover, arranged themselves along a
continuum from open to autonomous, controlled, familiar,
paternal, and closed prototypic climates.




Three parameters were also discovered in describing
the social interaction between an elementary principal
and his faculty: authenticity, satisfaction, and
leadership initiation. The first, said Halpin, defined
the "openness" of the behavior between the principal
and his faculty; the second, "the attainment of conjoint
satisfaction in respect to task accomplishment
and social needs"; and the third, the latitude with
which the principal as well as the faculty initiate
leadership acts.

In this investigation, the primary concern was with
the second conceptualization, "the conjoint satisfaction
in respect to task accomplishment and social
needs." For the faculty, this resulted in esprit, but
it was not the sole determinant in the school's organizational
climate. Eight behavioral patterns, four
belonging to the principal and four to the faculty, covarying
among themselves, identified the school's
organizational climate as being one of the six: open,
autonomous, controlled, familiar, paternal, or
closed. Halpin labeled the four principal behaviors
as thrust, production emphasis, aloofness, and consideration
and the four faculty behaviors as esprit,
intimacy, disengagement, and hindrance (4). The
Organizational Climate Description Questionnaire
(OCDQ) identified the school's organizational climate
through these eight subdimensions.
Data on the viability of the construct, school organizational
climate, were contained in the sources
listed in Table 1 (5). Reliability data have been reported
by Halpin and Anderson as follows in Tables 1 and 2.


The acceptable reliability of the OCDQ was again 
demonstrated by Anderson (1:81) who in a test- 
retest Pearsonian r correlation, as well as an odd- 
even respondent Pearsonian r with a Minnesota sam- 
ple, obtained the reliability coefficients (P < .01) 
shown in Table 2. 


The Principal's Data Sheet ( PDS) was designed 
to obtain the frequency of various types of oral and 
written communications between a principal and his 
faculty, but for this investigation, the average of the
total frequency of communications over a 20-day period
in each school within the sample became the sole
measure. By type of communication in a pilot study, 
item reliability coefficients were significant at least 
at the .05 level, while the odd-even respondent reli- 
ability coefficient for the whole PDS was .82, signif- 
icant at the .01 level (6:37-39). 


The population consisted of the 3,107 elementary 
schools listed in the 1966-67 Educational Directory 
of the State of Ohio (7). Proportionate random sam- 
pling by type of school allowed the mailing of seventy- 
two requests to city schools, sixty requests to county 
schools, and eight requests to exempted village 
schools. Fifty-two principals replied that they were 
willing to cooperate. Thirty-seven principals actu- 
ally completed the PDS, the other fifteen failing to 
respond to a tracer letter after the instruments had 
been mailed to them. 


TABLE 1


HALPIN'S ESTIMATES OF INTERNAL CONSISTENCY
AND OF EQUIVALENCE FOR THE EIGHT
OCDQ SUBDIMENSIONS (5:49)


                      Split-half           Correlation Between   Communality
                      Coefficient of       Scores of the Odd-    Estimates for
                      Reliability,         Numbered and the      Three-Factor
                      Corrected by the     Even-Numbered         Rotational
                      Spearman-Brown       Respondents in        Solution(c)
                      Formula(a)           Each School(b)
OCDQ Subtests         (N=1,151)            (N=71)                (N=1,151)

Disengagement         .73                  .59                   .66
Hindrance             .68                  .54                   .44
Esprit                .75                  .61                   .73
Intimacy              .60                  .49                   .53
Aloofness             .26                  .76                   .72
Production Emphasis   .55                  .73                   .53
Thrust                .84                  .75                   .68
Consideration         .59                  .63                   .64


(a) Estimate of internal consistency.
(b) Estimate of equivalence.
(c) These are lower-bound, conservative estimates of
equivalence.
TABLE 2

ANDERSON'S RELIABILITY COEFFICIENTS


                      Test-Retest          Pearsonian r Correlation
                      Pearsonian r         of Odd-Even Respondents

Disengagement         +.567                +.541
Hindrance             +.458                +.791
Esprit                +.805                +.685
Intimacy              +.653                +.668
Aloofness             +.196                +.708
Production Emphasis   +.787                +.692
Thrust                +.504                +.763
Consideration         +.805                +.556



Each cooperating principal was sent ten copies of 
the OCDQ and asked to distribute them randomly 
among his faculty. The percent of return by school 
ranged from 70 to 100 with the exception of three 
schools. By thus sampling generally 50 percent or 
more of the eligible faculty population within each of 
the thirty-seven schools, a high degree of precision 
could be attained in inferring to the whole faculty of 
each school (2:3). For the total sample, 310 OCDQ's
were returned of the 645 sent to the cooperating principals;
this represented a 47 percent response for
the total sample.


Of the thirty-seven schools in the sample, twenty- 
one were city schools; thirteen, county schools; and 
three, exempted village schools. No discernible rea- 
son could be given for the fifteen principals who failed 
to reply to the tracer letter other than that eight were 
from city schools, six from county schools, and one 
from an exempted village school. That these principals
failed to reply may have biased the sample, as
may the procedure employed, whereby each cooperating
principal selected the teachers to whom he
passed out the OCDQ's.


The nonparametric Spearman (rho) rank correlation
coefficient was selected as the main statistic,
for it was a distribution-free statistic and had about
91 percent of the efficiency of the Pearson correlation
coefficient in rejecting a null hypothesis. Since the
sample, as indicated above, may have become biased,
the rho correlation coefficient seemed to be the more
appropriate statistic to apply to the obtained data.
But, in addition, although the OCDQ itself was a
summated, Likert-type, equal interval scale, the
PDS, as constructed, did not meet the interval scale
requirement, but involved ordinal measurement instead.
Therefore, again the Spearman rho, not the
Pearson r, seemed to be the more appropriate correlational
statistic (9:202-213).
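
For readers unfamiliar with the statistic, a minimal sketch of the rho computation follows. The six school values are invented for illustration and are not the study's data; the helper names are hypothetical. Rho is obtained as the Pearson correlation of the two sets of ranks, with tied values given their average rank.

```python
import numpy as np

def average_ranks(v):
    """Ranks 1..n, with tied values sharing the mean of their ranks."""
    v = np.asarray(v, dtype=float)
    order = np.argsort(v, kind="mergesort")
    ranks = np.empty(len(v))
    ranks[order] = np.arange(1, len(v) + 1)
    for val in np.unique(v):
        tie = v == val
        ranks[tie] = ranks[tie].mean()
    return ranks

def spearman_rho(a, b):
    """Spearman rank correlation as the Pearson r of the ranks."""
    ra, rb = average_ranks(a), average_ranks(b)
    return float(np.corrcoef(ra, rb)[0, 1])

# Hypothetical values for six schools (not taken from the study):
freq = [12, 30, 7, 22, 15, 9]               # communications per 20-day period
esprit = [2.9, 3.4, 2.5, 3.6, 2.8, 3.0]     # OCDQ esprit mean scores
rho = spearman_rho(freq, esprit)
print(round(rho, 3))
```

With no ties, this agrees with the familiar formula $1 - 6 \sum d^2 / [n(n^2 - 1)]$, where d is the difference between the two ranks of a school.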

RESULTS

Table 3 shows the results of Spearman rank correlations
between the frequency of total principal-teacher
communications, the frequency of principal
downward communications to the faculty, the frequency
of teacher upward communications to the
principal, and the OCDQ esprit mean scores. The
rho correlation by school between the frequency of
total principal-teacher communications and the OCDQ
esprit mean scores was .21; between the frequency
of principal downward communications to the faculty,
.28; and between the frequency of teacher upward
communications to the principal, .31. None of these
rho's was significant at the .05 level on a one-tailed
test.


TABLE 3


SPEARMAN RANK CORRELATIONS


                                           OCDQ Esprit
                                           Mean Scores

Frequency of Total Principal-
Teacher Communications                     .21*

Frequency of Principal Downward
Communications to the Faculty              .28

Frequency of Teacher Upward
Communications to the Principal            .31


* None of the above rho's significant at the .05 level
of acceptance on one-tailed test.


The sample yielded six open, five autonomous,
three controlled, no familiar, five paternal, and
eighteen closed climate schools. [...]
controlled, familiar, and paternal. This is also in
keeping with the view of Halpin and Croft:


We have said that these climates have been
ranked in respect to openness versus closedness.
But we fully recognize how crude this
ranking is. As is the case in most methods
of ranking or scaling, we are much more confident
about the climates described at each
end of this listing than we are about those described
in between (5:


Table 4 shows the results. The rho between fre[...]
teacher communications and [...]


TABLE 4


SPEARMAN RANK CORRELATIONS BY OPEN OR
CLOSED SCHOOL CLIMATE BETWEEN THE FREQUENCY
OF TOTAL PRINCIPAL-TEACHER COMMUNICATIONS
AND THE OCDQ ESPRIT MEAN SCORES


Open Climate Schools      (N=6)     r_s = -.09*
Closed Climate Schools    (N=18)    r_s = .27


* None of the above rho's significant at the .05 level
of significance on a one-tailed test.


DISCUSSION


With no significant correlational findings (p < .05)
between the total frequency of principal-teacher
communications and teacher esprit, nor between the
frequency of principal downward communications to
his faculty and teacher esprit, nor between the frequency
of teacher upward communications to their
principal and teacher esprit, nor between the total
frequency of principal-teacher communications and
teacher esprit in the open and closed climate schools,
what inferences could be safely drawn from these
data?


Perhaps principal-teacher communications might
involve characteristics other than merely oral or
written attributes. To hold that all communication
was entirely verbal communication, said Halpin, was
perhaps fallacious for "actions spoke louder than
words" (4:253).


Even with the low level of overt behavior herein,
that is, the frequency of oral or written behavior either
by the principal or his fac[...]
differences were obtained. If overt [...]
otherwise, [...] If we are looking for la[...]
then our concepts must [...] operations, or [...]
realities [...]


[...] ought to communicate more [...]
thus human relations and human morale would ipso
facto improve. These findings here might suggest [...]
"communicate more" might
FOOTNOTES


1. This is a shorter version of the author's unpublished
doctoral dissertation under the same
title, University of Akron, 1969, and also a
part of Research Grant OEG-0-8-08005-3715,
"An Analysis of the Relationship of the Degree
of Satisfaction of Teachers Within Certain
Ohio Schools with the Formal Communication
of Their Principal," Bureau of
Research, Office of Education, U.S. Department
of Health, Education, and Welfare, Region
V, Chicago, Illinois, 1969. A version of this
paper was also presented under the same title
at the annual meeting of the American
Educational Research Association, Los Angeles,
California, 1969.
ABSTRACT


Norms, reliabilities, and validities for the five scales of an instrument designed to measure Bruner's model
are presented. In testing the instrument on a random sample of men and women, it was found that
both sexes emphasized knowledge of society, human condition, natural world, past, and artistic heritage in that
order. It is suggested that the Educational Values Inventory may be able to provide useful information to
educational planners when developing institutional goals.


IT SEEMS tenable to say that school systems
traditionally have lacked effective procedures for long
range planning of activities, evaluating accomplishments,
and reporting results to constituents. That
this situation can no longer be tolerated is evidenced
by the penetrating questions concerning the goals of
particular school systems, the processes of goal selection,
and the degree of goal attainment being asked
by students, teachers, parents, and other taxpayers.
The statement of goals is of central importance in a
school system because all decision making must reflect
the constancy of purpose and direction made
possible by such a rationale. During the last decade,
the effective use of the planning-programming-budgeting
system (PPBS) in various types of organizations
has encouraged optimism among those concerned
with educational planning (6). This optimism
seems reasonable so long as new approaches to the
process of goal selection are developed, for the
success of PPBS or any other long range planning
scheme depends heavily on the effectiveness of goal
development.


It is the purpose of this paper to describe an instrument
which has been developed to measure the
educational values of those individuals who are concerned
with school systems. Goals which are uniquely
appropriate for a particular school system are
more likely to be developed when the values of all
constituents are measured and considered. The determination
of the values of those serving and those
served by school systems is so vital to meaningful
goal development that it should not be left to chance.


The educational literature abounds with statements
of goals. In general, school systems have
gravitated toward two formulations. One appeared
in 1946 as a result of the efforts of the Educational
Policies Commission which was appointed by the National
Education Association (NEA) and the American
Association of School Administrators (4). The
other, published in 1953, was prepared by Kearney
for the Mid-Century Committee on Outcomes in Elementary
Education (9). These formulations of goals
are general in nature. We assume a school system
attempting to develop its own goals will use such
statements as guides in the development of a unique
set of goals which are relevant for one particular
town or city at one point in time. This is not a simple
process and often the goodness of fit is not outstanding.


The emphasis here is on the measurement of educational
values because the particular emphasis of
a statement of goals appears to reflect the value orientations
of the individual or group preparing them.
For example, the value held by the NEA of developing
the whole child so that he may fit into society is
evident behind the four major categories of the Educational
Policies Commission's statement on goals:
the objectives of self realization, the objectives of
human relationship, the objectives of economic efficiency,
and the objectives of civic responsibility
(4). It is only logical that a group oriented toward
social psychology would emphasize the relationships
among groups of people, whereas those interested in
clinical psychology would stress individual adjustment.


According to Allport, a value is a "belief upon
which a man acts by preference" (1:454). Because
values are relatively resistant to change, the way an
individual behaves now and at some future time rests
to a large degree on his personal values (1,2). We
have developed the Educational Values Inventory (8)
to measure an S's educational values. Its usefulness
in goal setting rests on the assumption that these
values are relatively stable and are reflected in his
behavior.


THE INSTRUMENT 


In his essay ‘‘After John Dewey, What?" (3), 
Bruner presents some thoughts which are relevant 
to the area of goals. Based on his suggestions, it 
appears that it is the school's responsibility to bring 
about cognitive and affective changes in the student's 
behavior in the following areas: 


1. The natural world. This area involves 
the student's understanding of the physi- 
cal sciences and geography. 


2. The human condition. Included in this 
domain is the student's understanding of 
himself—of his personality, interests, 
and attitudes. 


3. The nature and dynamics of society. 
This realm consists of developing the 
student's awareness of and respect for 
such areas as (1) the feelings, opinions, 
and rights of individuals; (2) the values 
held by other people and other societies; 
(3) economic and political structures; 
and (4) civic responsibility. 


4. The past. This domain involves the de- 
velopment of the student's understanding 
so that it may be used in experiencing the 
present and aspiring to the future. 


5. The products of our artistic heritage. 
This dimension relates to the student's 
understanding and appreciation of art, 
music, poetry, and other creative pro- 
ducts. 


Since these areas can be understood only when the 
student is adept in both language and mathematics, 
Bruner suggests that these two tools must have a 
central place in the curriculum (3:121-122). 


The Educational Values Inventory contains two 
sections. In the first part, the individual ranks from 
most important to least important the five areas of 
the curriculum noted in Bruner's model. These five 
areas which are described above are 


1. knowledge of the natural world,
2. knowledge of the human condition,
3. knowledge of the nature and dynamics of society,
4. knowledge of the past,
5. knowledge of our artistic heritage.


In the second section, four concrete examples of 
each of the five areas were developed. These are 


1. knowledge of the natural world: 


a. the reasons for changing weather 
patterns 

b. the functions of the human circula- 
tory system 

c. magnetic fields 

d. the role of plants in the environment


2. knowledge of the human condition: 


a. why he enjoys working with his hands

b. why he feels happy

c. why he feels angry sometimes

d. why he dislikes a particular person


3. knowledge of the nature and dynamics 
of society: 


a. the feelings of a fellow student 

b. what it is like to live in the city 

c. the values of the people in develop- 
ing countries in Africa 

d. the way governmental organizations 
operate 


4. knowledge of the past: 


a. the long term effects of Greek civi- 
lization 

b. Thomas Jefferson’s role in the colo- 
nial period 

c. the reasons behind the early Scandi- 
navian explorations 

d. the ramifications of the industrial 
revolution 


5. knowledge of our artistic heritage: 


a. the use of rhythm in musical expres- 
sion 

b. the works of Michelangelo 

c. the use of color in a painting by Van 
Gogh 

d. some of Robert Frost's poems. 


Thus, there are four items on each of five scales.
The items on the five scales were randomly rotated
and paired so that each item on one scale would be
presented with one item from each of the four remaining
scales. In this way sixteen forced-choice decisions
were offered for each of the five scales. This results
in forty pairs of items, since each pair is shared by
two scales (5 x 16 / 2 = 40). The following example
will illustrate this procedure:


It is extremely important for the elementary
school student to understand


1a. why he enjoys working with his hands       (Scale 2, item 1)
 b. the feelings of a fellow student           (Scale 3, item 1)

2a. the reasons for changing weather patterns  (Scale 1, item 1)
 b. the use of rhythm in musical expression    (Scale 5, item 1)
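
The pairing arithmetic described above can be checked with a short sketch. This is not the authors' randomization procedure, which the article does not specify; it assumes one simple rotation scheme in which, for every pair of scales, each item on one scale is matched with exactly one item on the other. The names `build_pairs` and `decisions_per_scale` are hypothetical.

```python
import random
from itertools import combinations

N_SCALES = 5          # the five goal areas
ITEMS_PER_SCALE = 4   # four concrete examples per area


def build_pairs(seed=0):
    """Pair every item with one item from each of the other four scales.

    Assumed scheme: for each pair of scales, rotate the item order of the
    second scale by a random shift and match items position by position.
    """
    rng = random.Random(seed)
    pairs = []
    for s1, s2 in combinations(range(N_SCALES), 2):
        shift = rng.randrange(ITEMS_PER_SCALE)
        for k in range(ITEMS_PER_SCALE):
            pairs.append(((s1, k), (s2, (k + shift) % ITEMS_PER_SCALE)))
    rng.shuffle(pairs)  # randomize presentation order
    return pairs


def decisions_per_scale(pairs):
    """Count how many forced-choice decisions involve each scale."""
    counts = [0] * N_SCALES
    for (s1, _), (s2, _) in pairs:
        counts[s1] += 1
        counts[s2] += 1
    return counts


pairs = build_pairs()
print(len(pairs))                  # 40
print(decisions_per_scale(pairs))  # [16, 16, 16, 16, 16]
```

Because each respondent makes exactly forty choices, the five scale scores always sum to forty, which is the source of the ipsative property discussed later in the article.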


Although values for the elementary school were re- 
quested during the administration of this instrument, 
it should be generalizable to any level at least in the 
Brunerian spiral curriculum model, as it is not the 
topic which is a function of level but rather the depth 
of investigation of that topic. 


THE SAMPLE 


A 10 percent random sample stratified by sex was
drawn from the March 1969 voter list of a small, upper
middle-class community within a radius of 20
miles of Boston. The Educational Values Inventory
was mailed to 122 men and 122 women. A code number
was assigned to each sample member and telephone
follow-ups were conducted. It was found that
the voter list was somewhat out of date and some
sample members had moved from town or died. In
addition, a few were away at college, in the armed
forces, or severely ill. Out of the theoretically possible
pool of Ss, fully completed questionnaires were
returned by 69 percent, or forty-seven, of the male
sample and by 65 percent, or fifty-five, of the female
sample. Possible biases which would indicate that
those who responded and those who failed to respond
differ in some systematic fashion were investigated.
An analysis of the differences between respondents
and non-respondents on amount of taxes paid, number
of children in school, and sex showed no statistically
significant differences above the .10 level,
suggesting no bias at least on these demographic
factors. In addition, each non-respondent was asked
to give a reason for his unwillingness to respond.
Approximately one third declined to give a reason;
approximately one third said they would return the
questionnaire but failed to do so; and approximately
one third were disqualified for an assortment of
reasons. Although these results do not rule out the
possibility of bias, they make it somewhat unlikely.


ANALYSIS OF THE EDUCATIONAL VALUES INVENTORY


As stated in the instrument section, the Educational
Values Inventory consists of a ranking of the
general goal areas as well as a forced choice
among these goals in specific situations. An examination
of both sections is important, as it might be
argued that although respondents may subscribe to
goals in theory, they may not follow their priorities
in actual practice. Having the respondent rank the
goal areas asks him what he believes. Having him 
make choices among concrete alternatives asks him 
how he would behave. Naturally, a further source 
of information would be to observe his behavior in 
an actual situation. 


The validity coefficients, which indicate the correlations
between rank and scale, are presented in
Table 1. [...]al choices consistent with their beliefs. In addition,
the results indicate that the Educational Values Inventory
is valid with respect to Bruner's goal formulation.


TABLE 1


VALIDITY COEFFICIENTS FOR EDUCATIONAL
VALUES INVENTORY (N = 130)a


                        Validity Coefficients
                                     Corrected for
Scale                Uncorrected     Attenuationb
Natural World           .25**            .36
Human Condition         .52**            .58
Society                 .32**            .55
Past                    .36**            .48
Artistic Heritage       .32**            .90


aIn addition to the scores of townspeople, scores
for groups of teachers and administrators were included
in this analysis.


bAssuming equal reliability of rank and scale.


**p < .01
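
The "corrected for attenuation" column can be illustrated with a small sketch. The article does not state the formula used; the code below assumes the standard Spearman correction, in which the observed correlation is divided by the square root of the product of the two reliabilities, and it follows footnote b in giving the rank the same reliability as the scale. The function name is hypothetical, and the reliability of .89 is the Human Condition value reported in Table 2.

```python
import math


def correct_for_attenuation(r_xy, rel_x, rel_y):
    """Spearman correction: observed r divided by sqrt(rel_x * rel_y)."""
    return r_xy / math.sqrt(rel_x * rel_y)


# Human Condition scale: uncorrected r = .52; scale reliability .89,
# with the rank assumed to be equally reliable (footnote b).
corrected = correct_for_attenuation(0.52, 0.89, 0.89)
print(round(corrected, 2))  # 0.58
```

Under the equal-reliability assumption the correction reduces to dividing the observed coefficient by the scale reliability, which reproduces the .58 shown for Human Condition.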


Table 2 shows the reliability coefficients for the 
five scales. These reliability coefficients which 
range from .59 to .89 suggest that this instrument 
has satisfactory internal reliability (5). 


Due to the ipsative nature of the instrument, the
negative correlations which exist among many of the
scales are expected. These are presented in Table
3. The correlations do appear to follow a pattern
which is intuitively satisfying in that respondents
interested in the student's social development (Human
condition, Society) tend to be less interested in his
mastery of skills and content (Natural world, Past,
Artistic heritage).


TABLE 2


RELIABILITY COEFFICIENTS FOR THE EDUCATIONAL
VALUES INVENTORYa (N = 135)b


Scale                Reliability Coefficient
Natural World                .68
Human Condition              .89
Society                      .59
Past                         .?4
Artistic Heritage            .65


aScale reliabilities computed from coefficient alpha,
Cronbach's generalization of the Kuder-Richardson
formula 20 for continuous scales (7).


bSee Table 1, footnote a.
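
For readers unfamiliar with coefficient alpha, the following is a minimal sketch of the usual computation from a respondents-by-items score matrix. The article reports only the resulting coefficients, so the data layout and the function name `cronbach_alpha` are assumptions made for illustration, and the demonstration scores are invented.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Coefficient alpha for a list of respondents, each a list of k item scores.

    alpha = k/(k-1) * (1 - sum of item variances / variance of total score)
    """
    k = len(scores[0])
    item_variances = [variance([row[j] for row in scores]) for j in range(k)]
    total_variance = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


# Hypothetical data: five respondents, four items on one scale.
demo = [
    [4, 3, 4, 3],
    [2, 2, 1, 2],
    [3, 3, 3, 4],
    [1, 0, 1, 1],
    [4, 4, 3, 4],
]
print(round(cronbach_alpha(demo), 2))
```

Alpha approaches 1 as the items on a scale rise and fall together across respondents, and falls toward 0 when they vary independently.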
TABLE 3


CORRELATION COEFFICIENTS FOR THE EDUCATIONAL
VALUES INVENTORY (N = 135)a


                             Correlation Coefficient
Scale                     1        2        3        4       5
1. Natural World        1.00
2. Human Condition      -.60**   1.00
3. Society              -.33**    .15    1.00
4. Past                  .16     -.54**  -.35**   1.00
5. Artistic Heritage     .03     -.41**  -.30**   -.07    1.00


aSee Table 1, footnote a.


**p < .01


In order to facilitate the investigation of the effect
of the respondent's sex on his choices, equal numbers
of males and females were included in the sample.
Analyses of the significance of the difference
between male and female mean scores on the five 
scales as noted in Table 4 indicate that there is a 
significant (p <.05) sex effect only on the Artistic 
heritage scale. This suggests that in general male 
and female respondents hold the same values for stu- 
dents in the various goal areas. Female respon- 
dents, however, would place more emphasis on the 
aesthetic than males. This finding seems to be in 
keeping with generally accepted sex role differences. 


The statistical analyses of the measurements ob- 
tained with the Educational Values Inventory do not 
contraindicate the satisfactory construct validity and 
internal reliability of this instrument. 


AN APPLICATION OF THE EDUCATIONAL VALUES INVENTORY


The graphs in Figures 1 and 2 indicate that the 
male and female respondents value the socialization 
goals—Human condition and Society—more highly 
than the content goals—Natural world, Past, and 
Artistic heritage. This may be interpreted as being 
both fundamental and timely. Traditionally, one of 


TABLE 4


SEX MEANS FOR THE SCALES OF THE EDUCATIONAL
VALUES INVENTORY (Male N = 47, Female N = 55)


                Natural    Human                          Artistic
                World      Condition   Society   Past     Heritage
Male mean        8.02        8.77       10.60    7.62       4.57
Female mean      7.40        8.73       10.27    7.56       6.04
Significance     n.s.        n.s.       n.s.     n.s.     t = 2.48,
                                                          p < .02


FIGURE 1


MEANS ON THE SCALES OF THE EDUCATIONAL
VALUES INVENTORY


[Line graph: mean score (vertical axis) on each scale (Natural
world, Human condition, Society, Past, Artistic heritage),
plotted separately for male and female respondents.]


the roles of the elementary school has been socialization.
Through John Dewey's work, this concept
has become of fundamental importance. These results
are also timely since much of today's concern
centers around the stability of society and a greater
concern for the realization of one's self.


In order to be consistent with the educational val- 
ues of its residents, the school system of this sub- 
urban town should have a goal framework which pays 


FIGURE 2


MEANS ON THE RANKS OF THE EDUCATIONAL
VALUES INVENTORY


[Line graph: mean ranka (vertical axis) of each goal area (Natural
world, Human condition, Society, Past, Artistic heritage),
plotted separately for male and female respondents.]


aThe lower the value, the greater the importance.
particular attention to the socialization of its students.
In addition, greater emphasis should be
placed on the natural sciences and social studies than 
on artistic endeavors. 


Naturally, the goals of a viable school system 
would reflect the values of the system’s other con- 
stituents—students, teachers, and administrators. 
If the Educational Values Inventory were used to obtain
data from each of these relevant groups, it
would be possible to select goals on a more empiri- 
cal basis than is customary. We feel that this in- 
strument can provide useful information to the edu- 
cational planner when he addresses the problem of 
goal formulation, something we hope will occur more 
frequently in the future than it has in the past. 


AN ALTERNATIVE TO THE STANDARDIZED SCORE
IN GRADING A MULTIPLE-CHOICE EXAMINATION


S. J. KILPATRICK, Jr. 
Virginia Commonwealth University 


ABSTRACT 


This paper describes the current grading procedure at the Medical College of Virginia and suggests that, 
rather than using the standardized score to grade multiple-choice examinations, the percent of known questions 
be estimated and used. Standardized scores tend to be misleading when used for multiple-choice questions in 
that they make no allowance for guessing. It is advocated that a passing grade be awarded to those students who 
score significantly higher than some minimum. Under this proposal the comprehensive examination at the end
of a year or phase would be replaced by reexaminations in the various subject matters but only for those students
who had failed to demonstrate a sufficient grasp of this material.


THE MEDICAL College of Virginia adopted
an integrated medical curriculum and a new system
of grading in 1964 (3). Since then, all examinations
have been composed of multiple-choice questions, usually
with four or five alternatives. Each student is given
a pre-coded answer sheet containing his Social Security
number. He records his answers by marking
one of the five "boxes" against each question number.
These forms are then automatically read by
a computer which compares each student's answers
with a master sheet, tallies the number of correct
answers, and prints, in alphabetical and rank order,
each student's score. This is given in four forms:
the number of correct answers, the percentage of
correct answers, a z score which is the deviation
of the number correct from the class mean divided
by the class standard deviation, and a standardized
score which is the z score standardized to a mean
of 50 and a standard deviation of 10.


Example: Consider a student, Y, who scored 66
out of 115 questions correct in a multiple-choice examination
in which the class mean was 79 and the
standard deviation of the class was 7.3 (Table 1).


Then:


Number correct = 66
Percent correct = 66/115 = 57%
z score = (66 - 79)/7.3 = -1.78
Standardized score = 50 + 10(-1.78) = 32
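
The four reported forms of a score can be reproduced with a few lines of code. This is only an illustrative sketch of the arithmetic in the example above; the function names are hypothetical and nothing here describes the College's actual scoring program.

```python
def z_score(correct, class_mean, class_sd):
    """Deviation of the number correct from the class mean, in s.d. units."""
    return (correct - class_mean) / class_sd


def standardized_score(z, mean=50.0, sd=10.0):
    """Rescale a z score to a mean of 50 and a standard deviation of 10."""
    return mean + sd * z


# Student Y: 66 of 115 correct; class mean 79, class s.d. 7.3.
z = z_score(66, 79, 7.3)
print(round(100 * 66 / 115))         # 57  (percent correct)
print(round(z, 2))                   # -1.78
print(round(standardized_score(z)))  # 32
```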


Students are graded Honors, Pass, or Fail largely
on the basis of the standardized score. While not
strictly adhered to, students with standardized scores
above 70 are considered for honors and those with
standardized scores below 30 are considered as having
failed. This is equivalent to using ±2 standard
deviations about the mean to discriminate among the
three potential groups (Honors, Pass, Fail). The
justification for this appears to be that about 5 percent
of the normal distribution lies outside these
limits. This policy is consistent with the reasoning
(3) that a B grade might be awarded to those students
with a standardized score between 50 and 65
(i.e., a score falling between the class mean and 1
and 1/2 standard deviations above the mean). In recent
years, more use has been made of a student's
rank in the examination. Since the decision to fail
those with standard scores below 30 is equivalent to
failing the last two or three in a class of one hundred
(assuming a normal distribution of scores), there
is little difference between these approaches.


The basic problem of grading under the integrated
curriculum is that the committee responsible for the
examination does not (and perhaps cannot) establish
the minimum passing score before the examination
is given. One reason for this is that the material in
a "subject matter" examination comes from a number
of biomedical and clinical disciplines. Another
reason is that all instructors in the subject matter
are required to submit questions. Since the number
of committee members is small compared with the
number of instructors, individual members of the
committee have little appreciation of the difficulty
of the examination they have to grade. As a result,
after the examination, the committee looks at the
distribution of scores to see how students have done
relative to each other.


This paper presents an approach in which the stu- 
dent’s performance is evaluated without reference to 
his peers. The method attempts to estimate what a 
student knows about the material covered. A student 
would then be given a passing grade only if he had 
demonstrated a satisfactory mastery of the subject. 
The committee has to define (preferably before the 
examination) what is a satisfactory level of knowl- 
edge. This could be done by requiring each commit- 
tee member to read those questions coming from his 
department and to state the minimum number he
would expect a passing student to know in that sec- 
tion of the examination. By combining these, the 
committee would have arrived at a minimum num- 
ber of questions a student would have to know to pass 
the examination. In turn, this figure could be con- 
verted into the equivalent minimum number of cor- 
rect answers by substitution in equation 1. 


THEORY 


Assume that the student knows a proportion $\kappa$ of the
material to be examined. If the $n$ questions in the
examination are independent and are a representative
sample from this material, the student may expect
to know $n\kappa$ of the questions. The remaining
$n(1-\kappa)$ questions he guesses, and assuming that these
questions have $a$ equally likely alternatives, he
may expect to get $n(1-\kappa)/a$ correct by chance. His
expected number correct, $E(s)$, is then


$E(s) = n\kappa + n(1-\kappa)/a$    (1)

Rewriting the equation gives as an estimate of $\kappa$

$\hat{\kappa} = (s/n - 1/a)/(1 - 1/a)$    (2)


Now $\hat{\kappa}$, the estimated knowledge, and $n\hat{\kappa}$, the estimated
number of questions known, are unbiased and
have some advantages over the standardized score:
they are more readily understood; each student's
performance can be evaluated independently of his
peers; and confidence limits may be given for a student's
knowledge of the material.
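
Equations 1 and 2 translate directly into code. The sketch below is an illustration rather than part of the original paper; the function names are hypothetical, and the numbers are those of student Y from the example (66 correct out of 115 four-alternative questions).

```python
def expected_correct(kappa, n, a):
    """Equation 1: known questions plus lucky guesses on the rest."""
    return n * kappa + n * (1 - kappa) / a


def estimate_knowledge(s, n, a):
    """Equation 2: estimated proportion of the material known."""
    return (s / n - 1 / a) / (1 - 1 / a)


kappa_hat = estimate_knowledge(66, 115, 4)
print(round(kappa_hat, 3))                       # 0.432
print(round(115 * kappa_hat))                    # 50 questions known
print(round(expected_correct(kappa_hat, 115, 4)))  # 66, recovering the score
```

Note that a student scoring at the chance level (s/n = 1/a) is credited with zero knowledge, and a score below chance yields a negative estimate, which in practice would presumably be reported as zero.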


Assume that $\kappa_0$ is set at the minimum passing
level of knowledge in an examination with $n$ multiple-choice
questions, each with $a$ alternatives. We may
calculate the probability that the $i$th student with the
minimum passing level of knowledge $\kappa_0$ has scored
$s_i$ or greater out of $n$, as


$\Pr[s \ge s_i \mid n, a, \kappa_0] = \sum_{s=s_i}^{n} \binom{n}{s} p^s q^{n-s}$


where $q = (1-\kappa_0)(1-1/a)$ is the probability that a


TABLE 1


EXAMPLE OF EQUIVALENT SCORES ON AN EXAMINATION
COMPOSED OF MULTIPLE-CHOICE
QUESTIONS WITH FOUR ALTERNATIVES


                        Class Mean       Yb               Zc
Number of questions        115           115              115
Number correct              79            66               64
                                                      (53 to 75)a
Percent correct             69            57               56
                                                      (46 to 65)
Number known                67            50               47
                                      (35 to 64)
Percent known               58            43               41
                                      (30 to 55)
z score                    0.00         -1.78            -2.00
Standardized score          50            32               30


aFigures in parentheses represent 95 percent confidence
limits for the estimated value.

bY is a hypothetical student whose knowledge is estimated.

cZ represents a cutoff two s.d. below the mean
(z = -2).


student knowing $\kappa_0$ of the material will answer a
question incorrectly and $p = 1-q$ is its complement.
Thus, in a multiple-choice examination we can tabulate,
along with the student's score $s_i$, the maximum
probability that he achieves this or a higher score
with an unsatisfactory level of knowledge of the subject
matter. Probabilities may also be calculated
that a low score is generated by a student having a
passing knowledge of the subject.
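
The binomial tail probability defined above can be evaluated exactly with the standard library. The sketch below is an illustration under the paper's own assumptions (independent questions, pure guessing among $a$ equally likely alternatives on unknown questions); the function name is hypothetical, and the passing level of .41 is taken from the Z column of Table 1 purely as an example.

```python
from math import comb


def prob_score_at_least(s_i, n, a, kappa0):
    """P(score >= s_i) for a student who knows exactly kappa0 of the material."""
    q = (1 - kappa0) * (1 - 1 / a)   # probability of a wrong answer
    p = 1 - q                        # probability of a right answer
    return sum(comb(n, s) * p**s * q**(n - s) for s in range(s_i, n + 1))


# Chance that a student at the minimum level (kappa0 = .41) scores 66 or
# more out of 115 four-alternative questions.
print(prob_score_at_least(66, 115, 4, 0.41))
```

A small value of this probability for an observed score would indicate that the score is unlikely to have come from a student at or below the minimum passing level of knowledge.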


APPLICATION 


Consider the results obtained by a hypothetical
student Y in a typical subject matter examination.
The examination consisted of 115 multiple-choice
questions, each with four alternatives. Y got sixty-six
questions correct, giving him 57.4 percent correct.
His estimated level of knowledge of this subject
matter was 43.2 percent, calculated from equation
2 as


$\hat{\kappa} = (.574 - .25)/.75$


and the 95 percent confidence limits of his knowledge
were (30.4%, 55.3%), calculated from equation 2 as


$\kappa = (.478 - .25)/.75$ (lower limit) and $\kappa = (.665 - .25)/.75$ (upper limit)


where the first value in parentheses in the above formulas,
viz., .478 and .665, was obtained by interpolation
in Geigy Tables (2) for n = 110, n = 120, and a
frequency of 57.4 percent. It follows that the estimated
number of questions known was 115 x .432 = 50 with
95 percent confidence limits of 115 x .304 = 35 to 115
x .553 = 64. Y therefore answered sixty-six questions
correctly, but of these he may have known from
thirty-five to sixty-four, with fifty as the most likely
number of questions known. His results are
summarized in Table 1.
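
The confidence limits above were read from published tables. As a rough cross-check, exact binomial (Clopper-Pearson) limits for the proportion correct can be computed by bisection and then passed through equation 2. This substitutes a direct calculation for the table interpolation used in the paper, so small differences from .478 and .665 are to be expected; all function names are hypothetical.

```python
from math import comb


def binom_cdf(x, n, p):
    """P(X <= x) for X ~ Binomial(n, p)."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(x + 1))


def bisect_root(f, lo=0.0, hi=1.0, tol=1e-10):
    """Root of a function that increases from negative to positive on [lo, hi]."""
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if f(mid) < 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def exact_limits(x, n, alpha=0.05):
    """Clopper-Pearson limits for a binomial proportion (0 < x < n assumed)."""
    lower = bisect_root(lambda p: (1 - binom_cdf(x - 1, n, p)) - alpha / 2)
    upper = bisect_root(lambda p: alpha / 2 - binom_cdf(x, n, p))
    return lower, upper


def to_knowledge(prop_correct, a):
    """Equation 2 applied to a proportion correct."""
    return (prop_correct - 1 / a) / (1 - 1 / a)


low, high = exact_limits(66, 115)
print(round(low, 3), round(high, 3))
print(round(to_knowledge(low, 4), 3), round(to_knowledge(high, 4), 3))
```

Because equation 2 is a linear, increasing transformation of the proportion correct, limits for the proportion convert directly into limits for the estimated knowledge.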
If a standardized score of 30 (two standard deviations
below the mean) is used as the cutoff point in
this examination, this is equivalent to saying that 
students must get more than 56 percent of the questions
correct. In turn, this is equivalent to the requirement
that students must be able to answer more
than forty-seven questions out of the 115 without 
guessing or know more than 41 percent of the sub- 
ject matter. To find how wide the indifference zone 
is, that is, for what range of scores can we not dis- 
criminate between pass and fail, we ask, what is the 
range of scores of 95 percent of students such as Z 
(Table 1) with knowledge equivalent to a standard- 
ized score of 30? These limits are calculated as 
fifty-three to seventy-five questions correct. The 
use of more than one cutoff is advocated so that, in 
the examination under consideration, anyone with a 
score below 53 would fail automatically, anyone with 
a score above 75 would pass automatically, and those
students scoring between these limits would be re-examined
(1). However, this is extremely impractical under 
the present organization of the integrated medical 
curriculum. An alternative approach is to exploit 
the use of the comprehensive examination given at 
the end of the year and to break a student's score 
into sub-scores appropriate to the various subjects 
previously tested. Those students who, like Y, 
scored below 75 in this subject matter examination, 
would be informed that in the comprehensive examination
they would have to demonstrate a command
of that subject significantly above the minimum re- 
quired. The drawback here is that, to be fair, such 
a comprehensive examination would involve as many 
questions as had been asked in each subject matter 
examination. The only realistic alternative appears,
therefore, to be to replace the comprehensive examination
with subject matter reexaminations taken only by 
those who had failed to demonstrate a satisfactory 
knowledge of the material earlier. 
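
The two-cutoff rule advocated above amounts to a three-way decision. A minimal sketch follows, using the limits of 53 and 75 quoted for this particular examination as default values; the function name and the returned labels are hypothetical.

```python
def grade_decision(score, fail_below=53, pass_above=75):
    """Classify a raw score using an indifference zone between two cutoffs."""
    if score < fail_below:
        return "fail"
    if score > pass_above:
        return "pass"
    return "re-examine"


print(grade_decision(66))  # re-examine  (student Y)
print(grade_decision(79))  # pass        (the class mean)
print(grade_decision(50))  # fail
```

In practice the two cutoffs would be recomputed for each examination from its length, the number of alternatives, and the minimum passing level of knowledge set by the committee.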


ENVIRONMENTAL CORRELATES OF
DIVERSE MENTAL ABILITIES


KEVIN MARJORIBANKS 
University of Oxford 


ABSTRACT 


The relationship between a refined measure of the home environment and four mental ability test scores
(verbal, number, spatial, and reasoning) was examined. The final sample for the study included 185 11-year-old
boys and their parents. The Science Research Associates (SRA) Primary Mental Abilities test was administered
to the boys. A newly constructed home interview schedule was used to obtain responses
from parents regarding the learning environment of the home. The environment was found to account for a large
percentage of the variance in verbal and number ability and a moderate percentage of the variance in reasoning
ability test scores. For spatial ability, the relationship with the environment was less definite. It was also found
that the environment accounted for more of the variance in the mental ability scores than did a set of social status
indicators and family structure variables.


MUCH OF THE research that has investigated 
the relationship between the environmental back- 
ground of children and intellectual ability has con- 
centrated on using global indicators of the environ- 
ment and intellectual ability. When the environment 
has been defined in terms of social status character- 
istics, such as the occupation of the father and the 
education of parents or family structure variables 
such as the family size and crowding ratio of the 
home, only a relatively small proportion of the vari- 
ability in the intellectual performance of children 
has been explained. Also, the utilization of global 
intelligence test scores obscures many important 
differences among children. 


Therefore the purpose of this present study was 
to examine the relationship between a refined mea- 
sure of the home environment and a set of mental 
ability test scores. 


METHOD

Mental Abilities

In the study four mental abilities were examined:
verbal, number, spatial, and reasoning. The mental
abilities were operationalized by the scores on
the relevant SRA Primary Mental Abilities subtests
(1962 Revised Edition).

Environment

The environment was defined as being composed
of a complex network of forces which surround the
individual. It is assumed that a subset of the total
network of environmental forces is related to each
human characteristic. Thus for verbal, number,
spatial, and reasoning ability it is proposed that
sub-environments or subsets of environmental forces
which will be related to each of the mental abilities
can be identified. The union of the four sub-environments,
which were postulated to be related
to the four mental abilities, was defined as the learning
environment. This learning environment may
be present in the home, school, and community.
Of these, the home produces the first and perhaps the
most powerful influence on the development of the
mental abilities. As a result, the home was chosen
as the focus of the present study.


From a review of relevant theoretical and empirical
literature (1, 2, 3, 4, 5, 6, 7), a set of eight environmental
forces was identified. Subsets of these
forces were postulated to be related to the mental
abilities. These forces were labeled:

1. press for achievement
2. press for activeness
3. press for intellectuality
4. press for independence
5. press for English
6. press for ethlanguage
7. mother dominance
8. father dominance


Each of the environmental forces was defined in 
terms of a set of environmental characteristics which 
were assumed to be the behavioral manifestations of 
the environmental forces. A list of the environmen- 
tal forces and the environmental characteristics is 
presented in Table 1. 


The environmental characteristics that are listed 
in Table 1 facilitated the development of an instru- 
ment for the study. The instrument, which was in 
the form of a semi-structured home interview schedule,
was used to gain a measure of the learning environ- 
ment of the home. Thus, the environmental forces 
were operationalized as the scores on the relevant 
environmental measures constructed for the study. 


The Sample 


Approximately five hundred 11-year-old boys were 
tested, using first the California Test of Mental Ma- 
turity (CTMM) and then the SRA Primary Mental 
Abilities Test. The first test-taking situation was 
used to establish examiner-examinee rapport, to insure
that all students were able to understand the
test instructions, and to establish as far as possible
uniform test-taking situations. The boys were assigned
to two categories, one classified as middle
class and the other as low class. The social class
classification was based on an equally weighted com- 
bination of the occupation of the head of the household 
and a rating of his (or her) education. As far as 
possible two parallel pools of boys were formed. 
The purpose of the substitute pool was to provide a 
set of alternate families which could be used in the 
study if families from the first pool did not agree to 
participate. 


The final sample consisted of ninety boys and 
their parents classified as middle class and ninety- 
five classified as low class. 


HYPOTHESES 


In the development of the study it was postulated 
that subsets of environmental forces which would be 
related to the mental abilities could be identified. 
Therefore the following hypothesis was investigated: 


Hypothesis 1: The verbal, number, spatial, and rea- 
soning ability test scores will be significantly re- 
lated to subsets of scores of environmental forces. 


It was also proposed that the utilization of subsets 
of environmental forces was a means of moving be- 
yond the use of gross classificatory variables such 
as social status indicators and family structure
characteristics as measures of the environment.
The advantage of using the sub-environment approach
was investigated by examining the following hypothesis.


Hypothesis 2: Scores on the environmental forces 
will be more highly related to measures of ver- 
bal, number, spatial, and reasoning ability than 
will other environmental measures such as social 
status indicators and family structure variables. 


TABLE 1


THE ENVIRONMENTAL FORCES AND THEIR
RELATED ENVIRONMENTAL CHARACTERISTICS
USED IN THE INTERVIEW SCHEDULE


Environmental Force          Environmental Characteristics

1. Press for                 1a. Parental expectations for
   Achievement                   the education of the child
                             1b. Social press
                             1c. Parents' own aspirations
                             1d. Preparation and planning
                                 for child's education
                             1e. Knowledge of child's educational
                                 progress
                             1f. Valuing educational accomplishments
                             1g. Parental interest in school

2. Press for                 2a. Extent and content of indoor
   Activeness                    activities
                             2b. Extent and content of outdoor
                                 activities
                             2c. Extent and purpose of the use
                                 of T.V. and other media

3. Press for                 3a. Number of thought provoking
   Intellectuality               activities engaged in by
                                 children
                             3b. Opportunities made available
                                 for thought provoking discussions
                                 and thinking
                             3c. Use of books, periodicals,
                                 and other literature

4. Press for                 4a. Freedom and encouragement
   Independence                  to explore the environment
                             4b. Stress on early independence

5. Press for                 5a. Language usage and reinforcement
   English                   5b. Opportunities available for
                                 language (English) usage

6. Father                    6a. Father's involvement in
   Dominance                     child's activities
                             6b. Father's role in family
                                 decision making

7. Mother                    7a. Mother's involvement in
   Dominance                     child's activities
                             7b. Mother's role in family
                                 decision making

8. Press for                 8a. Ethlanguage usage and
   Ethlanguage                   reinforcement
                             8b. Opportunities available
                                 for ethlanguage usage


RESULTS 


Before examining the hypotheses of the study it 
was considered desirable to investigate the reliability
of the home environment schedule constructed
for the study. 


The reliability coefficients for each scale are 
shown in Table 2. The coefficients were estimated 
by determining coefficient alpha (7). 
Since the study is concerned with the size of the
correlations between environmental forces and men- 
tal abilities, it was considered that the reliability 
coefficients were of an acceptable level. 


Hypothesis 1: The verbal, number, spatial, and 
reasoning ability test scores will be significantly 
related to subsets of scores of environmental 
forces. 


The first analysis of the hypothesis involved an 
examination of the zero-order correlations between 
the scores of the four mental ability tests and the 
scores of the environmental forces. These latter 
scores were computed from a simple summation of 
the scores of the environmental characteristics which 
were used to define the environmental forces. The 
zero-order correlations are presented in Table 3. 


The results in Table 3 indicate that the parental 
dominance dimensions had either low or negligible 
relationships with the mental abilities. To investi- 
gate the relationship between the two parental dimen- 
sions and the other environmental forces, a princi- 
pal component analysis of the eight environmental 
forces was conducted. The unrotated factor loading 
matrix of the interrelations among the forces is presented
in Table 4. Only those factors with an eigenvalue
greater than unity have been included. The
third factor had an eigenvalue of .65.


It can be observed from Table 4 that all of the en- 
vironmental forces load strongly on the first factor. 
This general factor was labeled the learning envir- 
onment of the home factor. The second factor, which 
loads heavily on the parental dominance forces, was
labeled the parental dominance factor. 


When interrelationships between the scores on the
two factors and the mental ability test scores were 
examined it was found that the scores on the learn- 
ing environment of the home factor were significantly 
related to scores on each of the mental abilities. 
None of the relationships between the scores on the 
parental dominance factor and the scores on the mental
ability tests reached statistical significance.


TABLE 2


RELIABILITY COEFFICIENTS OF THE ENVIRONMENTAL
SCALES (N = 185)


                             Reliability    Number      Standard Deviation
                             Coefficient    of Items    of Scores
Press for Achievement           .94            50           35.18
Press for Intellectuality       .98            18           17.05
Press for Activeness            .80            25           11.29
Press for Independence          .71            16            8.72
Press for English               .93            20           17.83
Press for Ethlanguage           .90            15           14.4
Father Dominance                .67            22            9.22
Mother Dominance                .66            22           10.33


TABLE 3


INTERRELATIONSHIPS BETWEEN THE MENTAL
ABILITY TEST SCORES AND THE SCORES OF
THE ENVIRONMENTAL FORCES (N = 185)


Environmental                             Abilities
Force                         Verbal     Number    Spatial   Reasoning
Press for Achievement          .66**      .66**     .28**      .39**
Press for Activeness           .52**      .41**     .22**      .26**
Press for Intellectuality      .61**      .53**     .26**      .31**
Press for Independence         .42**      .34**     .10        .23**
Press for English              .50**      .27**     .18**      .2?**
Press for Ethlanguage          .35**      .24**     .09        .04
Father Dominance            [illegible]   .10       .09        .11
Mother Dominance               .21**      .16*      .04        .04

*p < .05
**p < .01


1. Ethlanguage refers to any language other than
English used in the home.


Because of (1) the exploratory nature of the
study in identifying sub-environments for mental
abilities, and (2) the presence of a general factor,
it was decided to utilize the eight environmental
forces as the sub-environment for each mental
ability.


The relationship between the learning environment
of the home and each mental ability was examined by
computing the multiple correlation between the eight
environmental forces and each mental ability. In this
analysis the environmental forces formed a predictor
set and the mental abilities formed the criterion
vectors. The results of this analysis are presented in
Table 5.


The results in Table 5 indicate that when the
environmental forces are combined into a set of
predictors they account for a large percentage of the
variance in verbal and number ability test scores
and a moderate percentage of the variance in the
reasoning ability test scores. For spatial ability, the
corrected multiple correlation did not reach
statistical significance.


Thus the analysis of the data supports the hypothesis
that verbal, number, and reasoning abilities are
related to subsets of environmental forces. For
spatial ability, the relationship with the environment,
as measured in this study, was less definite.
TABLE 4


UNROTATED FACTOR LOADING MATRIX OF THE
ENVIRONMENTAL FORCES


                                     Factors
Environmental Force                I        II       h²
Press for Achievement             .83      .06      .69
Press for Activeness              .90      .08      .82
Press for Intellectuality         .91      .01      .83
Press for Independence            .64     -.25      .47
Press for English                 .76     -.10      .58
Press for Ethlanguage             .75      .10      .57
Mother Dominance                  .40      .84      .87
Father Dominance                  [not legible in source]

Eigenvalues                      4.196    1.533
Percentage of Variance
  Accounted for                  52.4     [not legible]
Cumulative Percentage of
  Total Variance                 52.4     71.6
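
The variance rows of Table 4 can be checked from the eigenvalues alone. The sketch below assumes the principal component analysis was run on standardized scores (a correlation matrix), so that the total variance equals the number of variables, eight. That assumption and the variable names are illustrative and are not stated in the article.

```python
# Eigenvalues reported in Table 4 for the two retained components.
eigenvalues = [4.196, 1.533]

# Assumption: eight standardized variables, so total variance is 8.
n_variables = 8

# Percentage of total variance accounted for by each component.
percent = [100 * ev / n_variables for ev in eigenvalues]

# Running (cumulative) percentage of total variance.
cumulative = []
running = 0.0
for p in percent:
    running += p
    cumulative.append(running)

for i, (p, c) in enumerate(zip(percent, cumulative), start=1):
    print(f"Factor {i}: {p:.1f}% of variance, {c:.1f}% cumulative")
```

Under that assumption the first component reproduces the 52.4 percent reported in the table to within rounding.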


Hypothesis 2: Scores on the environmental forces 
will be more highly related to measures of ver- 
bal, number, spatial, and reasoning ability than 
will other environmental measures such as social 
status indicators and family structure variables. 
In Table 6 the zero-order interrelationships be- 

tween a set of gross classificatory measures of the 


TABLE 5

MULTIPLE CORRELATIONS OF EACH OF THE
MENTAL ABILITY SCORES WITH THE EIGHT


                               Corrected(a)    Percentage
              Multiple         Multiple        of Total
Mental        Correlation      Correlation     Variance
Ability       R                Rc              Rc²
Verbal        [illegible]      [illegible]     50.4***
Number        .72***           [illegible]     50.4***
Spatial       .32**            .26              6.7

***p < .001
**p < .01
*p < .05


a. Corrected to allow for cumulative errors in
multiple R, and for small sample size.
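
The last two columns of Table 5 are linked: the percentage of total variance is the squared corrected multiple correlation times 100. As a rough sketch (the dictionary and names below are illustrative, not the author's procedure), the corrected correlations can be recovered from the legible percentages:

```python
import math

# Percentage of total variance (Rc squared x 100) as legible in Table 5.
percent_variance = {"Verbal": 50.4, "Number": 50.4, "Spatial": 6.7}

# Corrected multiple correlation implied by each percentage.
corrected_r = {
    ability: math.sqrt(pct / 100.0)
    for ability, pct in percent_variance.items()
}

for ability, rc in corrected_r.items():
    print(f"{ability}: Rc is approximately {rc:.2f}")
```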


environment and each of the mental abilities have 
been presented. 


A qualitative inspection of Tables 3 and 6 indicates
that, in general, the environmental force
scores are more highly related to the mental ability
test scores than are the gross indicators of the
environment.


A set of multiple correlation analyses was conducted
in order to compare the effectiveness of the
environmental force scores and the gross indicators
as predictors of the mental ability test scores. In
these analyses the amount of variance that could be
attributed to the environmental forces was computed
after accounting for the variance that could be
attributed to the gross indicators of the environment. The
results of the analyses are presented in Table 7.


The results in Table 7 indicate that the learning
environmental forces account for 25 percent of the
variance in verbal ability test scores, 34 percent
of the variance in number ability test scores, and
12 percent of the variance in reasoning ability test
scores after the variance due to the combination of
social status characteristics (occupation of father,
education of father, education of mother) and family
structure variables (number of children, ordinal
position, crowding ratio) has been allowed for. For
the spatial ability test scores the corrected multiple
correlation coefficient for “environment” did not
reach statistical significance.
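
The "environment" percentages quoted above are differences between two percentages in Table 7: the variance accounted for by the status variables plus the environmental forces (A) minus the variance accounted for by the status variables alone (B). A minimal sketch of that subtraction, with illustrative names, is:

```python
# Percentages of total variance from Table 7:
# (A: 6 status variables + 8 environmental forces, B: 6 status variables).
table7 = {
    "Verbal": (51.0, 26.0),
    "Number": (50.0, 16.0),
    "Spatial": (13.0, 8.0),
    "Reasoning": (18.0, 6.0),
}


def incremental_variance(full_pct, baseline_pct):
    """Variance attributed to the added predictors: C = A - B."""
    return full_pct - baseline_pct


increments = {
    ability: incremental_variance(a, b) for ability, (a, b) in table7.items()
}

for ability, c in increments.items():
    print(f"{ability}: {c:.1f} percent attributable to environment")
```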


Thus, the results provide support for the general 
acceptance of the second hypothesis. 


CONCLUSION 


The results indicate the efficacy of utilizing the 


TABLE 6

INTERRELATIONSHIPS BETWEEN GROSS INDICATORS
OF THE ENVIRONMENT AND MENTAL
ABILITY TEST SCORES (N = 185)


                                        Mental Abilities
Gross Indicators              Verbal    Number    Spatial    Reasoning
Education of Father           .29**     .27**     .26**      .22**
Education of Mother           .39**     .33**     .21**      .16*
Occupation of Father          .49**     .30**     .31**      .29**
Number of Children in family  -.32**    -.33**    -.04       -.03
Crowding Ratio                -.34**    -.34**    -.07       -.09
Ordinal Position in family    -.26**    -.25**    -.04       -.04

**p < .01
*p < .05
TABLE 7


RELATIONSHIP BETWEEN MENTAL ABILITIES, ENVIRONMENTAL FORCES, AND GROSS INDICATORS
OF THE ENVIRONMENT


                                                                  Computed       Corrected      Percentage of
                                                                  Multiple       Multiple       Total Variance
Criterion          Predictor Variables                            Correlation R  Correlation Rc Rc²
Verbal Ability     A = 6 status variables + 8 environmental forces  [illegible]  .71***         51.0***
                   B = 6 status variables                           .53***       .51***         26.0***
                   C = A - B = environment                                                      25.0***
Number Ability     A = 6 status variables + 8 environmental forces  .72***       .71***         50.0***
                   B = 6 status variables                           .42***       .40***         16.0***
                   C = A - B = environment                                                      34.0***
Spatial Ability    A = 6 status variables + 8 environmental forces  .38**        .36*           13.0*
                   B = 6 status variables                           .31***       .28***          8.0*
                   C = A - B                                                                     5.0
Reasoning Ability  A = 6 status variables + 8 environmental forces  .4[illegible]***  .42**     18.0**
                   B = 6 status variables                           .29**        .25             6.0
                   C = A - B = environment                                                      12.0**

***p < .001
**p < .01
*p < .05


sub-environment approach in analyzing the relationship
between the environment and intellectual
performance.


The study has theoretical, methodological, and 
practical significance. Evidence concerning the en- 
vironmental correlates of diverse mental abilities 
is central to theory construction in developmental 
psychology. Such evidence provides a clarification 
of both the basic nature and function of the mental 
abilities themselves and the characteristics of the 
environmental conditions that influence their devel- 
opment. 


Methodologically the study has significance as a 
new instrument was developed in order to assess 
environmental variation. The results also relate 
to the practical efforts that are being made to deter- 
mine the optimal educational conditions for children 
from diverse environmental backgrounds. If schools 
are to complement the home background of students 
it is necessary to know the specific effects that stu- 
dent background factors have on intellectual functioning. 


Thus the study in its investigations of the relationship
between the environment and mental abilities has
theoretical, methodological, and practical significance.
The results also indicate that it is possible to move
beyond the use of global indicators of the environment
to a much more detailed assessment of environments.
ABSTRACT

[...] with a Likert type attitude instrument constructed by
Aiken and Dreger (1, 2). There were 68 male and female
subjects, all of whom were non-mathematics majors enrolled
in the College of Education of a large state university in
the sout[...]. The correlation between the two instruments
was r = .90. It was concluded that the semantic differential
[...] was as effective a measure of attitude toward
mathematics as the Likert type [...]. The data also indicated
that [...] a more easily constructed semantic differential
could [...]

[...] the Mathematics Attitude Scale (MAS) [...] toward a
given object, or class of objects, [...] and other
dispositional variables to predict and explain reactions of
the person to [...]

The present study made use of the MAS developed
by Aiken and Dreger (1, 2). This instrument is an 
opinionnaire which makes use of a 5-point scale 
ranging from strongly disagree to strongly agree on 
each item. The MAS consists of twenty items with 
ten items stated positively and ten items stated neg- 
atively. 


Some examples of the positive items are Item 4,
“Mathematics is fascinating and fun,” and Item 11,
“Mathematics is something which I enjoy a great
deal.” An example from the negative items is Item
7, “I feel a sense of insecurity when attempting
mathematics.”


The instrument is scored to reflect a positive
attitude toward mathematics by assigning a 1 to strongly
disagree and a 5 to strongly agree on the positive
items and conversely on the negative items. A score
is obtained by summing the value assigned to a S's
response on each of the twenty items.
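
The scoring rule just described can be sketched as follows. The article identifies only Items 4 and 11 as positive and Item 7 as negative; the full set of negative items used below is a hypothetical key chosen purely for illustration, and the function name is likewise an assumption.

```python
N_ITEMS = 20
SCALE_MAX = 5  # 1 = strongly disagree ... 5 = strongly agree

# Hypothetical key: only Item 7 is confirmed as negative by the text
# (Items 4 and 11 are confirmed positive). The other nine are assumed.
NEGATIVE_ITEMS = {1, 3, 5, 7, 9, 12, 14, 16, 18, 20}


def score_mas(responses):
    """Sum the 20 item scores, reverse-keying the negative items.

    responses maps item number (1..20) to the raw response (1..5).
    """
    if set(responses) != set(range(1, N_ITEMS + 1)):
        raise ValueError("responses must cover items 1 through 20")
    total = 0
    for item, raw in responses.items():
        if not 1 <= raw <= SCALE_MAX:
            raise ValueError(f"item {item}: response {raw} out of range")
        # Negative items are scored conversely: 5 becomes 1, 4 becomes 2, ...
        total += (SCALE_MAX + 1 - raw) if item in NEGATIVE_ITEMS else raw
    return total


# A respondent who strongly agrees with every positive item and
# strongly disagrees with every negative item gets the maximum score.
most_favorable = {
    i: (1 if i in NEGATIVE_ITEMS else 5) for i in range(1, N_ITEMS + 1)
}
print(score_mas(most_favorable))
```

With twenty items on a 5-point scale, total scores range from 20 to 100, higher scores reflecting a more positive attitude.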


The test-retest reliability for the instrument,
according to Aiken and Dreger (1, 2), is r = .94.
Content validity is assumed; however, a test of
independence between the scores on the attitude scale and
scores on four items designed to measure attitudes
toward academic subjects, in general, suggested
that attitudes specific to mathematics were being
measured. Aiken and Dreger also established
predictive validity coefficients (mathematics
achievement) of .67 and .63 for males and females
respectively. Both coefficients were found to be statistically
significant.


Semantic Differential Scales of the type developed
by Osgood (4) have proven useful to researchers in
quantifying highly subjective data. The semantic
differential used in this study was designed to
measure attitude toward mathematics. The concept used
was MATHEMATICS. The instrument consisted of
fifteen bipolar adjectives placed at opposite ends of
a 7-point continuum, e.g.:


The adjectives used were: Pleasant - Unpleasant;
Bad - Good; Hard - Soft; Afraid - Unafraid; Active -
Passive; Valuable - Worthless; Strong - Weak;
Love - Hate; Fast - Slow; Comfortable - Uncomfortable;
Awful - Nice; Enjoyable - Unenjoyable; Light -
Heavy; Varied - Repetitive; and Secure - Insecure.


Upon analysis of the scales, higher scores, or 
more favorable scores, resulted directly from the 
extent to which the perceived entity was rated clos- 
est to the following poles: Pleasant, Good, Soft, 
Unafraid, Active, Valuable, Strong, Love, Fast, 
Comfortable, Nice, Enjoyable, Light, Varied, and 
Secure. 


The semantic differential was constructed according
to the criteria given by Osgood (4). Unlike the
construction of the MAS, elaborate item analysis
procedures and repeated revisions of the semantic
differential instrument were not necessary. This
constitutes a major advantage of the semantic
differential technique.


TABLE 1


MEANS AND STANDARD DEVIATIONS FOR
SEMANTIC DIFFERENTIAL SCALES


                                           Standard
Bipolar Adjectives              Mean       Deviation
Pleasant-Unpleasant             4.4106     1.9276
Bad-Good                        4.9559     1.7142
Hard-Soft                       2.8235     1.3376
Afraid-Unafraid                 4.2059     1.9122
Active-Passive                  4.8676     1.7611
Valuable-Worthless              5.8824     1.3442
Strong-Weak                     4.8088     1.7557
Love-Hate                       4.2206     1.4439
Fast-Slow                       4.2059     1.7239
Comfortable-Uncomfortable       4.1618     1.7923
Awful-Nice                      4.4118     1.5572
Enjoyable-Unenjoyable           4.5000     1.7577
Light-Heavy                     3.1765     1.5056
Varied-Repetitive               4.8382     1.7157
Secure-Insecure                 4.0000     1.7364

Procedure


The data were collected during the summer of
1970. The sixty-eight male and female Ss, all of
whom were non-mathematics majors, were enrolled
in a required doctoral level statistics class in the
College of Education. One-half of the Ss were given
the MAS. Approximately one week later, these Ss
were given the semantic differential designed by the
authors. The same procedure was followed with the
other half of the group except the order of the
instruments was reversed.

Treatment of Data


Means and standard deviations for all fifteen scales
on the semantic differential are presented in Table
1. To determine the evaluation scales on the semantic
differential, a factor analysis was performed.
The results of this analysis appear in Table 2. Two
factors emerged. Factor I appeared to be the
evaluative factor and was associated with 57.4 percent
of the explained variance. Factor II, a potency
factor, accounted for 10.6 percent of the explained
variance. The bipolar adjective pair, secure-insecure,
appeared to be associated with both factors.


A summative rating was obtained for each subject
on the MAS, the eleven evaluative scales on the
semantic differential, and a total semantic
differential score, composed of all fifteen scales.
The intercorrelations of these measures are
presented in Table 3.


Although a factor analysis was performed to identify
the evaluative scales, validity consideration (of
the semantic differential) dictated another approach
to the analyses of the data.


TABLE 2

FACTORS AND FACTOR LOADINGS RESULTING
FROM SUBJECTS' EVALUATION OF MATHEMATICS(a)


                                Factor    Factor
Bipolar Adjectives              I         II        h²
Pleasant-Unpleasant             .72       .53       .80
Good-Bad                        .85       .21       .81
Active-Passive                  .81       .15       .68
Valuable-Worthless              .75       .12       .57
Strong-Weak                     .85       .10       .73
Love-Hate                       .81       .32       .76
Fast-Slow                       .62       [illegible]  .41
Comfortable-Uncomfortable       .71       .49       .74
Awful-Nice                      .75       .47       .78
Enjoyable-Unenjoyable           .73       .50       .78
Varied-Repetitive               .56       .04       .31
Hard-Soft                       .14       .79       .64
Afraid-Unafraid                 .43       .66       .63
Light-Heavy                     .08       .82       .67
Secure-Insecure                 .64       .64       .83

a. Varimax rotation
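
The columns of Table 2 are related arithmetically. Assuming an orthogonal two-factor solution (the table footnote names a varimax rotation), the communality h² of a scale is the sum of its squared loadings. The sketch below checks this for four scales whose entries are legible in the source; the tolerance and names are illustrative choices.

```python
# (loading on Factor I, loading on Factor II, reported h-squared)
# for four scales whose Table 2 entries are legible in the source.
rows = {
    "Active-Passive": (0.81, 0.15, 0.68),
    "Afraid-Unafraid": (0.43, 0.66, 0.63),
    "Light-Heavy": (0.08, 0.82, 0.67),
    "Secure-Insecure": (0.64, 0.64, 0.83),
}


def communality(loading_1, loading_2):
    """h-squared for an orthogonal two-factor solution."""
    return loading_1 ** 2 + loading_2 ** 2


# Two-decimal loadings carry rounding error, so allow a small tolerance.
TOLERANCE = 0.02

checks = {
    scale: abs(communality(f1, f2) - h2) <= TOLERANCE
    for scale, (f1, f2, h2) in rows.items()
}

for scale, ok in checks.items():
    print(f"{scale}: {'consistent' if ok else 'inconsistent'}")
```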


TABLE 3


INTERCORRELATIONS AMONG THE MAS AND
SEMANTIC DIFFERENTIAL SCORES


Measures                        Intercorrelations(a)
MAS                             1.00     .87     .90
SD (Evaluative Scales)                   1.00    .97
SD (All Scales)                                  1.00

a. All correlations are significant at the .001 level
TABLE 4


DIFFERENCES BETWEEN MEAN RATINGS ON
SEMANTIC DIFFERENTIAL SCALES FOR
FAVORABLE AND UNFAVORABLE ATTITUDE
GROUPS


                                  Group A(b)   Group B
Scale(a)                          Mean         Mean        Difference
Pleasant-Unpleasant (E)           5.58         2.56        3.02
Enjoyable-Unenjoyable (E)         5.51         2.76        2.75
Comfortable-Uncomfortable (E)     5.16         2.44        2.72
Good-Bad (E)                      5.86         3.40        2.46
Awful-Nice (E)                    5.25         2.96        2.29
Active-Passive (E)                5.69         3.44        2.25
Strong-Weak (E)                   5.60         3.44        2.16
Afraid-Unafraid (P)               4.93         2.96        1.97
Love-Hate (E)                     4.93         3.00        1.93
Fast-Slow (E)                     4.88         3.04        1.84
Valuable-Worthless (E)            6.37         5.04        1.33
Hard-Soft (P)                     3.20         2.16        1.04
Varied-Repetitive (E)             5.16         4.28         .88
Light-Heavy (P)                   3.44         2.72         .72

a. (E) represents an evaluative scale, (P) a potency
scale.
b. Group A had a favorable attitude toward mathematics.
Group B had an unfavorable attitude toward
mathematics.


The Ss in the study were divided into two groups
[...] with a favorable at[...]. Means on the fifteen
scales of the semantic differential were computed for
each group. Table 4 presents the bipolar adjectives for
each of the two factors arranged from largest mean-scale
differences to small[...] secure-insecure, was omitted
since it did not belong to a single factor.


CONCLUSION 


As the data presented in Table 3 indicate, there
is a high positive correlation (r = .90) between the
total score on the semantic differential and the score
on the MAS. There is also a high positive correlation
(r = .87) between the total score on the
evaluative scales of the semantic differential and
the score on the MAS. These correlations are
significant at the .001 level.
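
One common way to test a product-moment correlation is to convert it to a t statistic with n - 2 degrees of freedom. The article does not say which test was used, so the sketch below is only an illustration of that standard conversion for n = 68. The critical value 3.46 is the tabled two-tailed .001 value for 60 degrees of freedom, used here as a conservative stand-in for 66 degrees of freedom.

```python
import math


def t_for_r(r, n):
    """t statistic for testing a correlation r from n pairs (df = n - 2)."""
    return r * math.sqrt(n - 2) / math.sqrt(1.0 - r * r)


N_SUBJECTS = 68

# Tabled two-tailed critical t at the .001 level for df = 60; the value
# for df = 66 is slightly smaller, so this threshold is conservative.
CRITICAL_T_001 = 3.46

for label, r in [("MAS vs. SD (all scales)", 0.90),
                 ("MAS vs. SD (evaluative scales)", 0.87)]:
    t = t_for_r(r, N_SUBJECTS)
    print(f"{label}: t = {t:.2f}, exceeds critical value: {t > CRITICAL_T_001}")
```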


It was therefore concluded that the semantic
differential constructed for this study proved to be as
effective a measure of attitude toward mathematics
as the MAS. Considering the ease with which the
semantic differential was constructed and the fact
that no extensive refinement of the instrument was
necessary, application of the semantic differential
technique would appear to be a more practical
approach to the measurement of attitudes in
mathematics.


From the data presented in Table 4, it is evident
that the evaluative scales as determined by factor
analysis did indeed represent those scales on which
the greatest scale mean differences occurred between
people who viewed mathematics favorably and
those who viewed mathematics unfavorably. Thus
construct validity for the semantic differential can
be inferred in that it exhibited both internal (factor
analysis) and external (correlational) validity.


SUMMARY 


A semantic differential constructed by the authors
was contrasted with a Likert attitude instrument, the
MAS, constructed by Aiken and Dreger (1, 2). The
sample consisted of sixty-eight graduate students who
were not mathematics majors, but were required to
take a statistics course as a part of their graduate
study.


Correlation coefficients computed among the sets
of total semantic differential scores, the semantic
differential evaluative scales scores, and the MAS
scores indicated the semantic differential was as
effective a measure of attitude toward mathematics as the
MAS. The significance of these relationships is
discussed.


Further analysis of data substantiated the hypoth- 
esis that people possessing favorable and unfavor- 
able attitudes toward mathematics would differ to 
the greatest extent on the evaluative scales of the 
semantic differential, thus lending construct valid- 
ity to the semantic differential. 
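ABSTRACT


Birth order was used to predict grades and school related
attitudes following Bradley's hypothesis that first-borns
are more academically interested and less socially
interested than later borns. Income, sex of child, and size
of family were studied as interacting variables. Several
interaction effects were found in the absence of birth
order main effects which indicate a need to study birth
order as an interacting variable rather than as a single
independent variable.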

A NUMBER OF studies have related academic
behavior to a child's ordinal position in the family.
Schacter (6) and Sampson (5) have shown that
first born children are overrepresented in college
populations, have higher grade point averages
(GPA) and higher need achievement scores than
later borns. Bradley and Sanborn (2) examined high
school teacher selections for “superior student”
counseling and found that first borns were
significantly overrepresented. They concluded that social
behaviors and attitudes of the children influenced the
teacher selections, since there were no birth order
intelligence differences.


Bradley (1), in a review, [...] first borns were more
adult oriented, [...] popular than later borns. He
concluded that first borns were more exposed [...]
pressures because of [...] hence, would reflect these
pressures more than later borns. Bradley also
hypothesized that first borns [...] perform better
academically, be more interested in academic
activities, and be less interested in extracurricular
activities than later borns because of the same social
pressures.


[...]obell (3) and McClure (4) have stated that
first borns are overrepresented in college populations,
but have given evidence that GPA and school
related behaviors and attitudes are not related in a
simple way to birth order.

y's nypotheses 


The present study used Bradley's hypotheses
about birth order effects to predict grades and
academic attitudes. It further studied the interaction
effect of sex of child, income of father, and
size of family with birth order in an attempt to find
out what variables cause differences in school related
attitudes.

METHOD 

Subjects were 312 freshman and sophomore psychology
students of both sexes. An anonymous questionnaire
with fourteen questions was answered within
10 minutes at the start of a regular class period.
The first six questions established background data
such as age, sex, birth order, size of family, and
GPA. The next four questions used a 10-point rating
scale to assess attitudes about academic and
extracurricular activities, which are described in
McClure (4). The last four questions gave Ss an
opportunity to select one of three behavior preferences
such as a choice among “living alone, living with a
roommate, or living in a fraternity.” The only
instructions given for the questionnaire were: “Please
fill out this questionnaire anonymously for some
research that is being done.”


RESULTS 


There were no main effects of birth order to predict
grades or attitudes, although there were several trends
in the predicted directions. For example, first borns
had higher mean GPA's than later borns by chi-square
analysis¹ (X² = 16.49, df = 9, p = .10). Also, first borns
reported a trend toward more academic enjoyment
than later borns (X² = 15.02, df = 9, p = .10).
There were several interesting interaction differences
which were significant, however. College
GPA's of first and later born males were compared
by chi-square, with every S having completed at least
one semester of college. This revealed that first
born males had a significantly higher average than
later born males (X² = 18.77, df = 9, p = .05). However,
when the mean GPA's (2.65 and 2.48, respectively)
were compared by analysis of variance only
a trend at the .10 level was found (F = 3.57, df = 1/176).
There was no significant difference in GPA
for first and later born girls (X² = 5.96, df = 9,
nonsignificant).


The academic attitude scales were all nonsignif- 
icant, but the behavior preference questions showed 
definite differences between first and later borns. 


When upper income ($11,000 plus) children were
given a choice among reading a book, watching TV
with a friend, or talking to a friend, first borns were
significantly more likely to read a book and later
borns to talk to a friend (X² = 6.25, df = 2, p = .05).
This difference did not occur with lower income
children.


When large family (4 or more siblings) children
were given a choice among the same behaviors, first
borns chose to read a book and later borns to watch
TV with a friend or talk to a friend (X² = 6.73, df = 2,
p = .05). This difference did not occur in smaller
family sizes.


When large family children were given a choice
among watching a movie, going to a football game,
or going to a party, first borns showed a trend to
choose a movie, while later borns chose the party
(X² = 5.91, df = 2, p = .06). There were other similar
trends, which were in the direction of Bradley's
hypotheses, but nonsignificant.
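
The chi-square statistics reported in this section can be converted to upper-tail probabilities from the statistic and its degrees of freedom. The sketch below uses the closed-form series for integer degrees of freedom and only the Python standard library. The function name is illustrative, and nothing here is taken from the original analysis beyond the reported statistics.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability P(X >= x) for chi-square with integer df."""
    if x < 0 or df < 1:
        raise ValueError("x must be >= 0 and df must be >= 1")
    half = x / 2.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{k < df/2} (x/2)^k / k!
        total = sum(half ** k / math.factorial(k) for k in range(df // 2))
        return math.exp(-half) * total
    # Odd df: erfc(sqrt(x/2)) plus a finite series of half-integer terms.
    m = (df - 1) // 2
    total = sum(
        half ** (k - 0.5) / math.gamma(k + 0.5) for k in range(1, m + 1)
    )
    return math.erfc(math.sqrt(half)) + math.exp(-half) * total


# Statistics as reported in the text: (description, chi-square, df).
reported = [
    ("GPA, all Ss", 16.49, 9),
    ("Academic enjoyment", 15.02, 9),
    ("GPA, males", 18.77, 9),
    ("GPA, girls", 5.96, 9),
    ("Book/TV/talk, upper income", 6.25, 2),
    ("Book/TV/talk, large family", 6.73, 2),
    ("Movie/game/party, large family", 5.91, 2),
]

for label, stat, df in reported:
    print(f"{label}: X2 = {stat}, df = {df}, p = {chi2_sf(stat, df):.3f}")
```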


These findings indicate several things. First,
there may be attitudinal and behavior choice
consequences of being first born which influence school
achievement. Second, these attitudes may not be
related in any simple way to birth order. There were
interactions with sex, income, and size of family in
this study before birth order effects became
significant. Third, although the results were in the
direction hypothesized by Bradley and others, the nature
of the influence of the other variables needs to be
explored further. Do the other variables magnify the
effects of birth order, or do they cause the effect?
Have positive results of birth order influence in
other studies been caused by birth order or some of
these other variables?


FOOTNOTE 


1. Used for simplicity. Where significant on ratio
data, analyses of variance are used subsequently.
Piaget's (4) principle of conservation of quantity has
been demonstrated by many researchers to apply to
American, European, and West African children. This
investigation set out to determine the relevance of
Piaget's ideas to a particular ethnic group in Sierra
Leone. Mende pupils (N = 231) of three age groups in
sixteen schools located in towns or villages in four
districts were interviewed on three tasks. The results
showed that pupils in the 7 to 8 year age group received
significantly higher (p < .01) scores than those younger
than 7. [...] concluded that conservation of quantity
varies with chronological age and task and that there are
individual differences.


PIAGET (4) claims that the notion of number [...]
gradually developed in children and that there [...]
in this development. For him, the stages [...] for
individual differences among children but [...] the
years before 7 are years in which [...] his
perception. He is, as it were, carried [...] by
physical appearance. For him a number may [...]
depending on circumstances. However, [...] In the
words of Piaget: [...] of the subject, but coordinates
all the different viewpoints in a system of
objective [...]


[...] Piaget's Geneva School of Thought [...] and
continues to study the ability of children to [...]
sets of objects in one-to-one [...] equivalence [...]
Those children who understand [...] sets are able to
relate [...] this comprehension is a principle which
Piaget calls cardination. Again, on the average, only
children 7 years of age or older can apply this
principle. Moreover, such children, unlike younger
ones, are [...]


Many studies in Canada, the United Kingdom, the [...]
development of quantity concepts [...] the child's
understanding of quan[...] as a springboard. Some of
these have been reviewed [...] words of Almy: [...]
Certain studies in the literature support the notion
that [...] This conclusion, [...] involving children
of different ages (1:34).


The present study aimed to evaluate Piaget's po[...]
on [...] ethnic group in Sierra Leone, the Mende.
Following the results of a pilot project which involved
[...] Mende [...] a cross sectional study of [...]
elementary schools [...]
most of Mendeland was designed. It looked at 231
Mende pupils in each of the four districts in which
the Mende are the majority tribe: Bo, Kenema,
Moyamba, and Pujehun.


METHOD AND PROCEDURE 


Subjects 


The design called for a sample of 240 pupils, fif- 
teen in each of sixteen schools stratified in terms of 
district, location in chiefdom town or village, and 
the chronological age of the children. The schools 
in each district located in villages were arranged 
alphabetically and from each group two schools were 
chosen resulting in a total of eight. Similarly, eight 
schools were selected from those located in chiefdom 
towns. In each school, five children were selected 
randomly from each of the three age-group popula- 
tions: 5 but less than 6, 6 but less than 7, and 7 but 
less than 8. All the requirements were satisfied by 
231 pupils. 
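
The sampling design described above can be laid out as a simple enumeration to confirm the planned totals. The labels and the structure of the sketch are illustrative; only the counts (four districts, two location types, two schools per group, three age groups, five children per age group) come from the text.

```python
from itertools import product

districts = ["Bo", "Kenema", "Moyamba", "Pujehun"]
locations = ["chiefdom town", "village"]
schools_per_group = 2          # two schools per district and location
age_groups = ["5 but less than 6", "6 but less than 7", "7 but less than 8"]
pupils_per_age_group = 5       # drawn at random within each school

# One entry per planned school, labeled by district, location, and index.
schools = [
    (district, location, index)
    for district, location in product(districts, locations)
    for index in range(1, schools_per_group + 1)
]

pupils_per_school = len(age_groups) * pupils_per_age_group
planned_sample = len(schools) * pupils_per_school

print(f"{len(schools)} schools, {pupils_per_school} pupils per school, "
      f"{planned_sample} pupils planned")
```

The planned sample of 240 compares with the 231 pupils who satisfied all the requirements.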


Tasks 


The Ss were given an oral interview which
involved testing for performance on three tasks. The
sequence of tests on each task was preceded by a
training session.


The training for Task 1 was designed to insure 
that each S could count up to ten in both English and 
Mende since ability to count to that level was essen- 
tial for the test. Ten Star bottle tops were arranged 
in two equal rows for this purpose. 


When it was clear that the child could count up to 
ten, the bottle tops were arranged in three rows of 
3-4-3. The child was then asked to tell the number 
of bottle tops without counting. Next, the ten tops 
were arranged in one row and again the number of 
tops was asked for. Finally, the S was expected to 
give a reason for his or her answer. 


This task involved the conservation of the number 
of two rows of bottle tops (that had been counted) 
through two transformations. A candidate could 
score 0 or 1 point on each of three questions. A to- 
tal score of 2 or 3 was accepted to represent conser- 
vation. 


Similarly, Task 2 incorporated the concept of the 
conservation of equality of two rows of bottle tops 
through two transformations. The idea was to com- 
pare the number of seven Star bottle tops with the 
number of seven Sprite bottle tops by using two dif- 
ferent arrangements. Each candidate could score 
0 or 1 on each of seven questions. A total score of 
4 or more was considered to represent conservation. 


[...] than” and therefore the meaning of the equality of
two weights. Two equal match boxes were used.
One was loaded with matches while the other was
loaded with stones. Each S was made to feel the two
[...] a 3-inch shoe lace woul[...]
weight of an assemblage which included the shoe lace,
a shallow container about 5 inches in diameter, and
four match sticks changed. Allowable scores were
0 or 1 on each of three questions. A score of 2 or 3
was necessary for conservation.
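
The scoring rules for the three tasks can be summarized in a short sketch. The question counts and cut-off scores are those given in the task descriptions; the function and variable names are illustrative.

```python
# task number -> (number of questions, minimum total counted as conservation)
TASK_RULES = {
    1: (3, 2),  # Task 1: three questions, total of 2 or 3
    2: (7, 4),  # Task 2: seven questions, total of 4 or more
    3: (3, 2),  # Task 3: three questions, total of 2 or 3
}


def conserves(task, item_scores):
    """Return True if a pupil's 0/1 item scores meet the task's cut-off."""
    n_questions, cutoff = TASK_RULES[task]
    if len(item_scores) != n_questions:
        raise ValueError(f"task {task} expects {n_questions} item scores")
    if any(score not in (0, 1) for score in item_scores):
        raise ValueError("each question is scored 0 or 1")
    return sum(item_scores) >= cutoff


# Example: a pupil answering two of three Task 1 questions correctly.
print(conserves(1, [1, 0, 1]))
```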


RESULTS AND DISCUSSION 


First, the data were analyzed qualitatively. A
noticeable trend was the failure of the average
less-than-7 year old to give good reasons for “yes” or
“no” responses. The typical member of this set did
not respond well to questions like “Why do you think
so?” Other representative reactions from such
children were: “The tied ones are heavier because they
are tied.” “The loose ones are heavier because they
are spread out.”


On the other hand, the typical child older than 7
invariably came up with a reason that made sense.
Where the less-than-7 year old would start recounting
ten bottle tops which were spread out in his
presence after he had counted them, the typical pupil
above 7 would respond “I know that there are ten
because I had counted them and you have not removed
any.”


Second, the data were analyzed statistically.
Table 1 shows the percentages of pupils of different age
groups who verbalized ability to conserve. Bartlett's
test of homogeneity indicated that the sample
represented a homogeneous population.



A multiple classification analysis of variance was
carried out (see Table 2). Three null hypotheses
were used. The first was that there was no difference
in mean scores due to district. The second
was that village pupils obtained essentially the same
scores, on the average, as pupils from chiefdom
towns. The third was that there were no differences
among the average scores of the three distinct age
groups used in the study.


There was no basis for rejecting two of the three
null hypotheses since the F values of 1.58 and 2.53
(see Table 2) were less than necessary for significance
beyond the .05 level. Therefore, the corresponding
variables (location in town or village and
district) were not responsible for any differences
in performance on the tasks. Moreover, none of the
various interactions among variables were significant
(see Table 2).


On the other hand, the variable age resulted in an F value of 154.05
(see Table 2). This F value was significant beyond the .01 level.
Obviously, such a difference in performance could not be attributed
to chance.


TABLE 1

PERCENTAGES OF PUPILS OF DIFFERENT AGE GROUPS WHO CONSERVED

                          AGE
TASK     5 but less     6 but less     7 but less
         than 6         than 7         than 8

One      36.0           52.6           92.0
Two      24.0           27.5           19.


TABLE 2

ANALYSIS OF VARIANCE FOR THE SCORES OF 231 PUPILS

Source                               df        SS        MS         F

District                              3     30.84     10.28      2.53
Town or Village                       1      6.42      6.42      1.58
Age                                   2   1250.90    625.45    154.05*
Age x District                        6     25.50      4.25      1.05
Age x Town or Village                 3     18.26     39.13      9.64
District x Town or Village            3     20.32      6.77      1.67
Age x District x Town or Village      6     40.89      6.82      1.68
Within                              207    841.27      4.06

Total                               230   2294.40

*p < .01
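The F ratios for the main effects in Table 2 follow from simple arithmetic: each mean square is a sum of squares divided by its degrees of freedom, and each F is that mean square divided by the within-cell mean square. The following is a minimal Python sketch of that arithmetic, using the values printed in the table. The function and variable names are illustrative assumptions, and mean squares are rounded to two decimals only because the table prints them that way.

```python
# Sketch: recompute the main-effect F ratios printed in Table 2.
# Names are hypothetical; the SS and df values are those shown in the table.

def mean_square(ss, df):
    """Mean square = sum of squares / degrees of freedom (2 decimals, as printed)."""
    return round(ss / df, 2)


def f_ratio(ss, df, ms_within):
    """F = effect mean square / within-cell (error) mean square."""
    return round(mean_square(ss, df) / ms_within, 2)


# Within-cell mean square from the "Within" row of the table.
ms_within = mean_square(841.27, 207)

# (sum of squares, degrees of freedom) for the three main effects.
main_effects = {
    "District": (30.84, 3),
    "Town or Village": (6.42, 1),
    "Age": (1250.90, 2),
}

f_values = {
    name: f_ratio(ss, df, ms_within) for name, (ss, df) in main_effects.items()
}

for name, f in f_values.items():
    print(f"{name}: F = {f}")
```

Only the age effect yields an F far larger than the others, which is what the text relies on when it rejects the third null hypothesis.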


This significant difference permitted the use of 
the Tukey procedure (6) for comparing means (see 
Table 3). Two of the differences, 5.06 and 3.70 (see 
Table 3), were greater than the computed difference 
of 2.22. Thus, the conclusions were drawn that the 
7 but less-than-8 age group pupils were superior in 
their performance to both the 5 but less-than-6 age 
group and the 6 but less-than-7 age group. 


TABLE 3

TUKEY COMPARISON OF MEANS

Age group             Mean     Mean - 5.06     Mean - 6.42

7 but less than 8     10.12       5.06            3.70
6 but less than 7      6.42       1.36            0.00
5 but less than 6      5.06       0.00
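The comparison reported in Table 3 can be sketched in a few lines of Python: every pairwise difference between age-group means is set against the computed critical difference of 2.22 given in the text. The dictionary keys and variable names below are illustrative assumptions; the derivation of the critical difference itself (from the Tukey procedure) is not reproduced here.

```python
from itertools import combinations

# Age-group means as printed in Table 3.
group_means = {
    "5 but less than 6": 5.06,
    "6 but less than 7": 6.42,
    "7 but less than 8": 10.12,
}

# Critical difference reported in the text (taken as given, not derived here).
critical_difference = 2.22

significant_pairs = []
for (name_a, mean_a), (name_b, mean_b) in combinations(group_means.items(), 2):
    diff = round(abs(mean_a - mean_b), 2)
    if diff > critical_difference:
        significant_pairs.append((name_a, name_b, diff))

for name_a, name_b, diff in significant_pairs:
    print(f"{name_b} vs {name_a}: difference = {diff}")
```

Both differences that exceed the critical value involve the oldest group, which matches the conclusion drawn from the table.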


Furthermore, this difference would appear not to have been due to
difference in ability to verbalize, since most Ss verbalized well
during the tests. They communicated freely with the interviewers in
Mende or English. It seems that these findings confirm the results of
other investigators (1) that conservation abilities vary with age of
the Ss and with tasks and that there are individual differences.


FOOTNOTE 


1. This project was totally financed by the Research Grants and
Publications Committee of Njala University College.


ABSTRACT


To study the relationships among frequency of testing, arithmetic
learning and retention, predispositional test anxiety, defensiveness
against admission of test anxiety, and induced test anxiety, eighty
sixth-grade Ss were randomly assigned to four arousal conditions:
tests every day, tests every other day, tests once a week, and daily
practice. Teachers were randomly rotated daily. The study lasted 5
weeks. An achievement posttest was given at the end of the study and
again 2 weeks later. Induced test anxiety was measured at the end of
each week. On both achievement posttests, the only significant
difference was in favor of the daily test group over the weekly test
group. Induced test anxiety was found to operate similar to
predispositional test anxiety.


TEST ANXIETY in realistic classroom settings has usually been studied
in a predispositional sense (8). That is, Ss are given a measure of
test anxiety before any treatments are applied in order to classify
them into predispositionally anxious groups, such as high, medium,
and low. One or more tasks are then given to Ss to see how the
predispositional levels of test anxiety relate to ongoing performance
or achievement. However, Ruebush (7) has also discussed how
experimentally manipulatable or induced test anxiety can be studied
in relation to performance. Here, Ss can be randomly assigned to
various treatment groups, stress-producing treatments of various
degrees can be applied, and then a measure of test anxiety can be
given to detect whether or not such affect has been induced in them.
Because of the administrative complexities involved, induced test
anxiety studies have rarely been conducted in realistic classroom
testing situations (4).


In a pilot study (4) ninety-three third-grade arithmetic pupils were
randomly assigned to two treatments of high predictability: daily
testing or daily practice. Methods, sex, and IQ (high, medium, and
low) were subjected to analysis of variance. An immediate arithmetic
achievement posttest was given at the end of the 4-week study and
again 3 months later. To measure any affects that might be induced by
the treatments, the Test Anxiety Scale for Children (TASC) and the
Defensiveness Scale for Children (DSC) were given at the close of the
study. The treatments produced comparably high levels of anxiety,
defensiveness, and achievement in all Ss. One possible explanation of
the results is that since the daily testing and the daily practice
stress conditions were both highly predictable by Ss after a few
days, comparable behavior in the affective and cognitive realms was
to be expected.


To investigate further the nature of experimentally induced test
anxiety, defensiveness, and achievement under realistic learning
conditions, the present study was conducted with stress-producing
treatments of both high and low predictability: daily testing and
daily practice (highly predictable by Ss), and testing every other
day and testing only once a week (less predictable by Ss). On the
basis of past research (4), it was hypothesized that the two highly
predictable stressful conditions would produce significantly lower
anxiety and defensiveness scores than the two less predictable
treatments. These hypotheses were predicated upon the arousal theory
of Hebb (2).

PROGER-MANN-TAYLOR-MORRELL 


PROCEDURE 


The 5-week experiment was conducted in the 
late spring of 1968. The Ss (working N = 80) con- 
sisted of the entire sixth grade of a suburban ele- 
mentary school in a middle-class community in the 
Greater Philadelphia Area. The pupils represented 
an average to above-average ability in composition 
(average fifth-grade, Lorge-Thorndike IQ = 115.00). 


There were three intact arithmetic classes. Each Monday, Wednesday,
and Friday, these three classes received their usual arithmetic
instruction for an hour following recess in the latter part of the
morning. On Tuesdays and Thursdays these classes met for 45 minutes.
To control differential practice effects from taking or not taking
the experimental tests, the testing process was isolated from the
usual instructional cycle in arithmetic by having Ss receive tests or
equivalent practice early in the morning each day. The test or
practice was on the previous day's work. To control any effects due
to differences in presentation of new material among the three
regular intact classes, Ss of the three groups were randomly assigned
to the four stress-producing conditions of the experimental testing
periods. To control differential effects of teacher personalities and
efficacies, a random rotation schedule of teachers was used for the
testing period. The original three arithmetic teachers were used, and
the most competent student teacher became the fourth proctor. No
tests or practice were given on the last day of each school week,
which was reserved for an abbreviated version of TASC (5).


To control further the differential practice effects that might
otherwise be caused by varying test contents on any given day, the
senior investigator devised an identical worksheet format for both
test and practice groups during the experimental testing period.
Depending upon the schedules of testing within the treatment groups,
each of the four groups was told that the worksheet was a test or was
only practice, as the case might be, and the headings on the
worksheets reflected these facts accordingly.


A specially devised immediate achievement posttest (46 items, with
two items of each type of fraction and division problem covered
during the 5-week unit of instruction; see Proger (5)) was given, and
the same test was given as a measure of delayed achievement 2 weeks
later. The immediate achievement posttest had a
Spearman-Brown-corrected, odd-even coefficient of internal
consistency of 0.97. To obtain measures on the predispositional
levels of test anxiety on the first day of the study, the unabridged
versions of both the commonly accepted TASC and the Defensiveness
[against admission of test anxiety] Scale for Children (DSC) were
given. In this study, the 30-item TASC had a split-half reliability
of 0.85 (corrected), and the reliability of the DSC calculated the
same way was 0.16. At the end of each school week, an abbreviated
test anxiety scale (the 12 items from TASC judged to be most
pertinent to induced test anxiety; see Proger (5)) was given. For the
first, third, and fifth administrations of the 12-item scale, the
respective split-half coefficients of internal consistency were
[illegible], 0.80, and 0.83.
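The split-half reliabilities above are described as corrected, which in this context refers to the Spearman-Brown formula for stepping a half-test correlation up to full-test length. The following is a minimal Python sketch of that standard formula and its inverse. The function names are illustrative assumptions, and the half-test correlation recovered for the TASC is only what the reported corrected value of 0.85 would imply, not a figure given in the article.

```python
def spearman_brown(r_half):
    """Step a half-test correlation up to full-test reliability."""
    return 2 * r_half / (1 + r_half)


def half_test_r(r_full):
    """Invert the correction: the half-test correlation implied by r_full."""
    return r_full / (2 - r_full)


# Implied uncorrected odd-even correlation for the 30-item TASC,
# assuming the reported 0.85 is the Spearman-Brown-corrected value.
r_half_tasc = half_test_r(0.85)

print(f"implied half-test r: {r_half_tasc:.3f}")
print(f"corrected again:     {spearman_brown(r_half_tasc):.2f}")
```

The correction always raises a positive half-test correlation, which is why corrected coefficients should be labeled as such when reported.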
RESULTS


The three analyses of variance (the two achievement posttest analyses
and the induced test anxiety trials analysis) will be considered
first. All factors in each analysis were considered fixed because of
the specific factor compositions. The cell frequencies of the data
matrices were unequal because of random experimental mortality.
Therefore, the unweighted-means approach was deemed more appropriate
than the least-squares, unequal-frequency method (10:374). The
results of the two achievement posttest analyses of variance are
presented in Tables 1 and 2.


On the three-way analyses of variance of both posttests, the four
testing methods were significantly different at the .05 level. On
both posttests, the control factor of previous achievement operated
quite effectively at the .05 level in removing variance from the
error component. In neither analysis were the main effect of sex or
the interaction effects significant. Scheffé's technique for multiple
comparisons (1) showed that the daily test procedure was
significantly more effective than the weekly test approach at the
more conservative .05 level on both posttests; no other individual
comparisons were significant. The means of the four groups on the
46-item immediate achievement posttest were: daily tests, 33.05;
alternate days, 24.63; weekly tests, 22.57; and daily practice,
26.82. The means of the groups on the same test given as a measure of
retention were: daily tests, 31.35; alternate days, 23.11; weekly
tests, 20.41; and daily practice, 24.56.


The third unweighted-means analysis of var- 
iance to be considered is that of the induced test 
anxiety trial means. Because previous work in the 
area of test anxiety induced by different schedules 


TABLE 1

UNWEIGHTED-MEANS ANALYSIS OF VARIANCE:
IMMEDIATE ACHIEVEMENT POSTTEST

Source of                    Sum of               Mean         F
Variation                    Squares      df      Square       Test

Methods (A)                   892.70       3       297.57      3.58*
Sex (B)                       191.19       1       191.19      2.30
Previous Achievement (C)     3191.29       1      3191.29     38.45**
A x B                          51.70       3        17.23      <1
A x C                          35.95       3        11.98      <1
B x C                           0.12       1         0.12      <1
A x B x C                      26.09       3         8.70      <1
Within Cell                  5140.96      62        82.92

* .01 < p < .025
** p < .005


TABLE 2

UNWEIGHTED-MEANS ANALYSIS OF VARIANCE:
DELAYED ACHIEVEMENT POSTTEST

Source of                    Sum of               Mean         F
Variation                    Squares      df      Square       Test

Methods (A)                  1017.57       3       339.19      3.74*
Sex (B)                       270.20       1       270.26      2.98
Previous Achievement (C)     3415.01       1      3415.01     37.65**
A x B                           8.68       3         2.89      <1
A x C                          15.97       3         5.32      <1
B x C                           0.36       1         0.36      <1
A x B x C                      24.91       3         8.30      <1
Within Cell                  5714.98      63        90.71

* .01 < p < .025
** p < .005


of testing yielded no significant differences on either sex or
previous achievement (4), the factorial model chosen to study induced
test anxiety dealt only with methods and trials (see Proger (5) for
comparable results yielded by the model with the two additional
factors of sex and previous achievement in it). To accommodate the
unequal cell frequencies, the unweighted-means, repeated-measures
design of Winer (10:376-378) was used. The results of this analysis
are presented in Table 3.


TABLE 3

UNWEIGHTED-MEANS ANALYSIS OF VARIANCE:
INDUCED TEST ANXIETY

Source of                    Sum of               Mean         F
Variation                    Squares      df      Square       Test

Between:                                  83
  Methods (A)                  50.02       3        16.00      <1
  Ss Within Groups           2701.28      80        33.77
Within:                                  336
  Trials (B)                   99.63       4        24.91     15.87*
  A x B                         9.21      12         0.77      <1
  B x Ss Within Groups        501.72     320         1.57

* p < .005


Only the main effect of trials (the test anxiety scale
administrations) was significant at the .05 level. The weekly overall
induced test anxiety averages were: trial 1, 4.02; trial 2,
[illegible]; trial 3, 3.27; trial 4, 2.70; and trial 5, 2.73. The
greatest decrease in anxiety occurred between the first and second
administrations (p = .005). Finally, since there was no significant
interaction, only the overall trends were calculated as in Winer
(10:273-275). There was a highly significant linear decrease at the
.05 level (F = 57.20, p = .005), but the quadratic, cubic, and
quartic trends were all insignificant.


The last analysis undertaken was to study the interrelationships
among predispositional defensiveness against admission of test
anxiety, induced test anxiety, and immediate posttest achievement.
The product-moment coefficients of correlation are given in Table 4.

The matter of induced test anxiety will be
considered first. Contrary to h


anxiety and defensiveness, Further, all four treat- 
ments demonstrated compa 


ing this second explanation was provided by the teachers themselves:
they noted that the pupils vocally expressed their displeasure and
boredom with answering "the same old questions" on test anxiety each
week of the study.


A final note on the weekly decreases of induced test anxiety seems in
order. The largest decrease occurred between the first and second
administrations of the abbreviated TASC. In weekly


between the first and second trials. The measurement of periodic
change has always been a delicate matter, fraught with difficulties.
For cognitive retesting, one might have to deal with test-wiseness.
In this study, for affective retesting, instrument desensitization
seems to be most pertinent. However, the methodology of repeated
testing in the field of emotionality is not yet well enough
understood to provide detailed explanations.


Further inferences about the nature of induced test anxiety as it
operated in this study can be drawn from the intercorrelations in
Table 4. As one would have expected on the basis of past studies at
the elementary school level (8), the correlation of -0.19 between
immediate posttest achievement and predispositional test anxiety is
significant at the .05 level. Similarly, as found in the third-grade
frequent testing study of Mann and others (4), the r = -0.002 between
immediate posttest achievement and predispositional defensiveness is
insignificant (p > .20). A third result that supports the usual
expectations under a predispositional test anxiety framework is the
r = 0.31 between TASC and DSC (p = 0.005). However, to the
investigators' knowledge, no studies to date have related levels of
predispositional anxiety to levels of induced anxiety under
realistic, longitudinal learning conditions. The correlations of
every other week's induced test anxiety levels, as measured by the
abbreviated TASC (A-TASC 1, A-TASC 3, A-TASC 5), with the original
predispositional anxiety levels, as measured by TASC, are also given
in Table 4. In general, there is a relatively high relationship
between induced and predispositional anxiety.


TABLE 4

RELATIONSHIPS AMONG IMMEDIATE POSTTEST ACHIEVEMENT (I.A.),
PREDISPOSITIONAL TEST ANXIETY (TASC), PREDISPOSITIONAL DEFENSIVENESS
(DSC), AND INDUCED TEST ANXIETY (A-TASC 1, A-TASC 3, AND A-TASC 5)*

             I.A.     TASC     DSC      A-TASC 1   A-TASC 3   A-TASC 5

I.A.         1.00    -0.19    -0.002     -0.16      -0.15      -0.30
TASC          --      1.00     0.31       0.72       0.60       0.54
DSC           --       --      1.00       0.21       0.16       0.13
A-TASC 1      --       --       --        1.00       0.78       0.76
A-TASC 3      --       --       --         --        1.00       0.90
A-TASC 5      --       --       --         --         --        1.00

*N = 80; significance was determined by t-test (9:281), testing
H0: r = 0.00 against the appropriate one-sided alternative (on the
basis of past predispositional test anxiety and predispositional
defensiveness studies). For N = 80, the critical values of r and the
corresponding levels of significance are: 0.36 (p = .0005), 0.29
(p = 0.005), 0.26 (p = 0.01), 0.22 (p = 0.025), 0.19 (p = 0.05), 0.15
(p = 0.10), and 0.10 (p = .20).


Finally, the induced achievement results will be discussed. In
summary, it was found that only the program of daily testing resulted
in significantly higher achievement at the .05 level in both
posttests. The alternate day test group, the weekly test group, and
the daily practice group were not significantly different at the .05
level on either posttest. A graphical summary of the induced
achievement results is shown in Figure 1. One should note that the
arrangement of methods groups on the horizontal axis in order of
increasing amount of testing does not imply equal intervals or
degrees of testing.


FIGURE 1

COMPARISONS OF ACHIEVEMENT ON IMMEDIATE AND DELAYED POSTTESTS FOR
FOUR METHODS GROUPS IN ORDER OF INCREASING DEGREES OF TESTING, POOLED
ACROSS SEX AND PREVIOUS ACHIEVEMENT


In our present state of knowledge, explanations of the psychological
mechanism which operated to produce the achievement curves in Figure
1 must be only conjecture. The following discussion is in terms of
physiological and affective arousal phenomena, as might be inferred
from Hebb (2). First, one must remember that each of the four groups
in the present study was told on the day the worksheets were
distributed whether the worksheets
were to be counted as practice problems or as a
test. The exact schedule of testing that any par- 
ticular pupil was under was not made known to him 
at any point in the experiment. This procedure was 
necessary to control test expectation throughout all 
four groups. It is likely that after the first few days
of the 5-week experiment, the pupils in the daily 
test group and the daily practice group inferred that 
they would be under a continuous schedule of tests 
or practice, respectively. Thus, while all four 
groups might have been under some form of emo- 
tional stress with respect to expectation the first 
few days, the daily test group and the daily practice 
group probably attained an optimal expectation 
arousal state. On the other hand, the alternate 
day test group and the weekly test group were under
less obvious schedules of testing; the likelihood that 
these pupils could infer the schedule of testing they 
were under was perhaps decreased because of the 
interruptions by the 2-day weekends and the ad- 
ministrations of the test anxiety scale on the last 
school day of each week. Perhaps the least obvious 
schedule of testing was that of the weekly test group. 
This may account for the poorest performance of 
this group out of all four groups.


It may be, using the arousal (or drive acti- 
vation) theory of Hebb (2:249-250), that the pupils
in both the alternate day test group and the weekly 
test group were aroused to an excessive state of 
activation (with an unstable expectation state per- 
haps playing a crucial role in bringing about this 
arousal); thus, they could not perform as well as 
they might under more predictable conditions; these
Ss might be subject to "increasing emotional
disturbance, [and] anxiety" (2:250). On the other
hand, the Ss in the groups undergoing the relative- 
ly stable expectation condition (the daily test group 
and the daily practice group) probably had not 
exceeded their optimal arousal level (that which 
produced the best performance) and hence had not 
yet lost as much relative “alertness, interest 
[and] positive emotion” (2:250). 


IMPLICATIONS 


The formal study of induced affect, as manipulated by different
treatments in realistic settings, has been a neglected area. The
methodology of the measurement of change in affect over time is
complicated. With repeated administrations of the same measure of
affect, there is the usual contamination problem of prior measures on
successive ones. There is the much less commonly recognized problem
of purging of affect, which all too often gives rise to the spurious
conclusion of no difference among treatments with respect to induced
levels of affect. In this second problem, after termination of a
relatively long application of treatments, the view can be offered
that any immediate posttests of cognition (given along with the
measures of affect) signal to the subject the obvious end to any
stress-reducing treatments; thus,


was present in this study, it is suggested that more emphasis be
given to in-process measures of affect in longitudinal studies, as
and post-measures,


Finally, one can ask what might be done in future research on the
cognitive aspects of the stress-producing treatments used in this
study. For example, the daily test group of the present experiment
has shown that tests in and of themselves at the elementary school
level can teach Ss material above and beyond only practice
situations; the problem then arises as to exactly how such content
learning arousal effects take place. Perhaps adopting the methods
used by the "test-like event" investigators would enable one to
attack this problem. Basically, "test-like events" are study-guide
questions inserted into reading passages or assignments when given in
class (that is, this approximates a test situation in its evaluative
aspects as compared to study guide questions given as outside-class
homework where the study situation is relatively informal and
non-test-like). For example, Rothkopf and Bisbicos (6) have dealt
essentially with completion-type review questions inserted into the
text itself. The advantage in such procedures is that one can gain a
great deal of control in studying an effect such as content
structuring in "test-like" situations, that was hitherto unavailable.


THE DESIGN OF CORRELATION STUDIES


KEITH Е. PUNCH 
The Ontario Institute for Studies in Education 


ABSTRACT 


Correlation techniques will continue to be widely used, because of
the ex post facto nature of much educational research. Within this
constraint, the general aim of moving from description to the
building of explanatory theory, and the operational aim of accounting
for variance, point to the need for multivariate rather than
bivariate correlation studies. This in turn points to multiple linear
regression as the basic design and analysis tool, the more so in view
of its power to answer as well typical analysis of variance
questions. If, in isolated cases, bivariate correlation studies are
unavoidable, they should systematically prepare for later
multivariate studies. The error variance of measures used should in
all cases be estimated and reported, given the aim of accounting for
variance.


MUCH EMPIRICAL research in education, and in social science
generally, is necessarily ex post facto in nature. This, in the
general case, means correlation studies, and explains the widespread
use of correlation techniques in reported research. It is not
necessary to go into the usual laments about the deficiencies of ex
post facto research and correlation studies. There are, of course,
problems. But at least such studies, by not manipulating variables,
admit a truer, less artificial view of the world of interest than do
studies in the classical experimental design. What is needed, for
better ex post facto research, is a set of techniques for handling
complex multiple relationships, and logical checks and balances in
the interpretation of relationships observed. Given the importance of
correlation techniques, then, this article deals with some of the
problems and strategies involved in developing and using techniques
for handling complex multiple relationships. Though the points raised
have design implications, the discussion is, for simplicity, cast
mostly in data analysis terms.


The usual problem in data analysis is to obtain 
maximum information from the data, while not bruis- 
ing them unfairly in quest of support for hypothe- 
ses. Here, by contrast, the reverse kind of 


relationship between two variables. One of these, implicitly or
explicitly in the investigator's conceptualizing, is seen as
"dependent" or "criterion," the other as "independent" or
"predictor." Usually, though with varying degrees of plausibility,
the data are assumed to meet certain conditions, and the
product-moment coefficient, r, is used. The rules regarding the use
of this coefficient are well documented, if not always well followed.
Thus, for example, predictor variable x and criterion variable y are
found to correlate +0.3, which, with a sample size of, say, 100, is
significant beyond the .01 level. This permits strong confidence in
the rejection of the null hypothesis ρ = 0, and in the acceptance of
its alternative ρ > 0. The report then typically moves to a
discussion of the implications of the relationship just established.


Now there is nothing wrong with this procedure in itself. By not
going far enough, however, it enhances the possibility of invalid
interpretation and inference. The point is that while r is a useful
summary index of the relationship, r², the square of the correlation
coefficient, gives an estimate of the part of the variance in y held
in common with, or accounted for by, variance in x. This point has
been well documented in the literature (4), including its
mathematical proof (3). In the above hypothetical example, then, an r
of +0.3 has almost certainly not come about through sampling error,
and indicates a real relationship in the population under
consideration. Yet with r² = .09, a mere 9 percent of the variance in
y is accounted for by variance in x. If small or zero error variance
in y is assumed, this leaves some 90 percent of its variance
untapped. This point is too often unrecognized, and hence is omitted.
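The contrast between statistical significance and variance accounted for can be made concrete with the article's own hypothetical figures (r = +0.3, N = 100). The following Python sketch uses the standard t statistic for testing a zero population correlation; the function name is an illustrative assumption, and no critical value is computed here.

```python
from math import sqrt


def t_for_r(r, n):
    """t statistic for testing a zero population correlation, df = n - 2."""
    return r * sqrt(n - 2) / sqrt(1 - r ** 2)


r, n = 0.3, 100                       # hypothetical figures from the text
t_value = t_for_r(r, n)               # evidence against a zero correlation
variance_explained = r ** 2           # proportion of variance in y held in common with x
variance_unexplained = 1 - variance_explained

print(f"t = {t_value:.2f} with {n - 2} degrees of freedom")
print(f"variance accounted for:     {variance_explained:.0%}")
print(f"variance not accounted for: {variance_unexplained:.0%}")
```

A t of roughly 3.1 on 98 degrees of freedom is comfortably beyond conventional .01 critical values, yet the same r leaves about nine-tenths of the variance in y untouched.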


The possibility of invalid inference is clear, over and above any
"correlation-causation" problem. That is, invalid inference is likely
even if we establish, or assume, a causal influence from x to y.
While x and y are clearly related, it is idle to consider changing x
as a strategy for changing y. Yet this may well be the interpretation
given or taken, especially in practitioner oriented use. It is the
more likely when, as is usual, the report underlines the statistical
significance of the observed r and stresses and discusses the
relationship, but neither mentions nor interprets the r². In the
above example, we really know very little at this stage about what
varies with y, causally or not, given that 90 percent of its variance
is uninvestigated.


ed research, the problem is 
ationship does hold, but the 
attern of variation in у remains virtually 
isi f the basic ге- 
search problems. Ultimately, research seeks answers to questions of
the kind "What causes y?" For reasons of strategy, this is rephrased
as "What causes y to vary?" To avoid questions of causation, this in
turn is rephrased as "What is the pattern of variation of y?" That
is, what variables are associated with y, or hold variance in common
with y? In other words, research seeks to account for as close to 100
percent of the variance in the y under consideration as is possible.
Seen in this light, and acknowledging multiple causation, to
delineate isolated 2-variable relationships is to make only the first
research step.


In substance, then, while r may be highly significant statistically,
r² may be very small. Indeed, with large enough sample size, a
correlation of .01 reaches statistical significance, but the
proportion of variance accounted for is negligible. In the face of
this problem, two strategies are possible when it comes to designing
research. Which should be used depends on the amount already known in
the area of concern. The first applies when very little can be
claimed in advance about variation in the y under study. The second
applies when the influence of variables other than the x on any
particular y is known or suspected. Both keep in mind that accounting
for variance is more the goal than testing for isolated
relationships.

Consider the case where we wish to investigate the relationship
between any particular x and y. We suspect that, while the
relationship will hold, x will at best account for only a small
proportion of the variance in y.
S propose Ту 
ienificant correla . 

are made, either with something like "random speculation" to identify
these other variables, or with the mechanical use of standard,
easy-to-collect, base line data (sex, age, IQ, socioeconomic status,
etc., variables used far too often and far too mechanically). Neither
alternative belongs in a study designed for economy of, and maximum
output from, data collection. In this case the study should be
programmed so that the x-y relationship is investigated as a first
step, but the researcher returns to the sample, on the basis of the
correlation results, to search systematically for the other
correlates of y.


The rationale is simple. Assume that the analysis produces the
hypothetical figures used above (r_xy = +0.3, N = 100). It is now
possible to rank order the one hundred Ss according to their standard
scores on variable x, and by dividing them at the median, to
designate fifty as high and fifty as low. Independently of this
designation, the hundred Ss can likewise be ranked according to
standard scores on variable y, with fifty designated high and fifty
low. Each S now has a 2-way classification, either (1) high-high, (2)
high-low, (3) low-high, or (4) low-low. Clearly, it is groups 1 and 4
which bring about whatever positive correlation exists, and groups 2
and 3 which work against it. Two groups have now been identified:
those who contribute to the correlation, and those who detract from
it. The researcher can now return to the sample to search
systematically for differences between the two groups. If only groups
1 and 4 were used, the computed correlation would be stronger, and
positive; if only groups 2 and 3 were used, it would be weaker and
perhaps negative. This suggests the question: what differentiates Ss
in 1 and 4 as a group, from those in 2 and 3 as a group? To answer
this question is to identify other potentially important predictor
variables, of whatever kind. The next study in this area can then
incorporate these variables into the investigation for greater
payoff, in terms of the proportion of variance in y "explained."
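The median-split classification described above can be sketched in Python as follows. The eight pairs of scores are made-up illustrative data, not figures from the article, and the function names are assumptions; scores equal to the median are treated as "low," a choice the article does not specify.

```python
from math import sqrt
from statistics import mean, median


def pearson_r(xs, ys):
    """Product-moment correlation between two equal-length lists."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    sxx = sum((a - mx) ** 2 for a in xs)
    syy = sum((b - my) ** 2 for b in ys)
    return sxy / sqrt(sxx * syy)


def median_split_groups(xs, ys):
    """Classify each S as 1 high-high, 2 high-low, 3 low-high, 4 low-low."""
    mx, my = median(xs), median(ys)
    codes = {(True, True): 1, (True, False): 2, (False, True): 3, (False, False): 4}
    return [codes[(a > mx, b > my)] for a, b in zip(xs, ys)]


# Hypothetical scores for eight Ss (illustration only).
xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [2, 1, 4, 7, 3, 8, 5, 6]

groups = median_split_groups(xs, ys)
agree = [i for i, g in enumerate(groups) if g in (1, 4)]      # contribute to r
disagree = [i for i, g in enumerate(groups) if g in (2, 3)]   # detract from r

r_all = pearson_r(xs, ys)
r_agree = pearson_r([xs[i] for i in agree], [ys[i] for i in agree])
r_disagree = pearson_r([xs[i] for i in disagree], [ys[i] for i in disagree])

print(groups)
print(f"all Ss: {r_all:.2f}  groups 1 and 4: {r_agree:.2f}  groups 2 and 3: {r_disagree:.2f}")
```

With these illustrative scores the correlation within groups 1 and 4 is stronger than the overall correlation, and within groups 2 and 3 it is negative, which is the pattern the argument anticipates. The substantive step, finding what distinguishes the two sets of Ss, still requires returning to the sample.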


The obvious difficulty here is the practical one
of returning to the sample. Yet this kind of analysis
would increase the effectiveness of bivariate
Since such studies represent 
ch step, they ought to at- 
ide direction for later multivariate 
studies in the particular area. Clearly, the above 
procedure can be mo: Thus, 
for example, 


nd those which do not, 
to be reinvestigated. 


idends from а pi rch necessarily limit- 
ed to investigating 2 2-variable relationship. 
not be necessary. Omitting 
practical problems of data collection, how often 
must research, because of insufficient knowledge, 
consider the relationship between only two variables, 
between only one predictor x and the criterion y? 
At least not very often in educational research, and 
for two reasons. Firstly, in any 
al of knowledge and near- 
ented and uncertain. The 
ically exploit all that is 
Further, and вес- 


But all of this may 


is a great de 


86 THE JOURNAL OF EXPERIMENTAL EDUCATION 


ondly, he must adequately analyze his problem. 
An orientation toward explanation through theory building as the goal
of scientific research, rather than only description and documentation,
is essential here. Thus an hypothesized relationship between x and y
should be built on some basis, and that basis should be exposed and
analyzed. The question then becomes: Why should this relationship exist,
if it in fact does? Research too often remains at the level of mere
description, whereas the goal of science is the building of explanatory
theory. Given a tight, logical structure of the type "If the theory is
true, then the hypothesized x-y relationship follows," testing the
hypothesis by documenting the relationship becomes the vehicle for
confirming or rejecting the theory.¹ An adequate explanation for the
hypothesis will normally involve (a) specifying the intervening
variables by which x is connected to y, (b) identifying the conditions
under which x will be more or less strongly related to y, and sometimes
(c) showing the proposed relationship to be an instance of some wider
generalization. Points a and b force consideration of the other possible
predictors of y, either in conjunction with x or independent of it. An
adequate analysis would indicate whether they are to be seen as
additional predictors, or as control variables. With adequate
operationalization, they can then be incorporated into the research.


The point then is that proper problem analysis, and attention to theory
building prior to structuring and gathering data, will convert most
bivariate studies into multivariate ones. Multivariate studies require
multivariate techniques. Assume we now have one criterion variable y,
and a series of predictor variables $x_1, x_2, \ldots, x_k$. For such
cases, multiple linear regression is the general, basic design and
analysis tool. An exposition of the technique is available from numerous
sources (1, for example). For present purposes, we should note that
three important kinds of questions can be answered.


(i) We can estimate the proportion of variance in y accounted for by,
or held in common with, all predictors $x_1, x_2, \ldots, x_k$
considered together. Just as a one-predictor study yields r and, more
importantly, $r^2$, so multiple predictor studies yield R, the multiple
correlation coefficient, and, more importantly, its square $R^2$. The
$R^2$, of course, will be between 0 and 1, and its statistical
significance is testable through F. It represents the most direct answer
to the question of accounting for variance, proposed earlier as one of
the basic research questions.


(ii) Within the limits of the answer to (i), we can begin to assess the
relative order of importance among the predictors in accounting for
variance. There are at least two ways of doing this. One is by adding
(or deleting) predictor variables in prediction systems using stepwise
regression procedures, to determine the contribution to prediction of
the variable added or deleted over and above that of other predictors.²
The problem here is that with interrelated predictors, the relative
importance of each will likely depend on the order in which it is
entered into, or dropped from, the regression model. No statistical
solution exists to the problem of "the correct" order for entering
predictors into a regression model. The researcher may, however, be able
to justify a conceptual (or perhaps temporal) ordering among the
predictors. If so, or in the unlikely event of unrelated predictors, the
stepwise routine can give the proportion of variance associated with
each predictor, within the cumulative limits of (i). The other approach
is through the standardized partial regression coefficients, or beta
weights. If for example the beta weight for predictor $x_1$ is .45, and
that for $x_2$ is .06, and the difference between these weights is
significant, $x_1$ is a more important determinant of y than is $x_2$.
This is so because a beta weight indicates how much change in y is
produced by a standardized change in the particular predictor variable
with other predictors held constant. Which of the two ways to use should
depend on the interpretation given to "the relative importance of
predictors." A precise operational version of this phrase would, in most
cases, point to the second and neater method, the beta weight analysis.
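Both approaches in (ii) can be illustrated for the two-predictor case, where the beta weights follow directly from the three correlations by the standard formulas. The sketch below is only an illustration: the correlations (.50, .30, .40) are invented and the function name is hypothetical. It also shows the order dependence noted above, since the increment in explained variance credited to each predictor changes with the order of entry.

```python
def beta_weights_two_predictors(r_y1, r_y2, r_12):
    """Standardized partial regression coefficients for two predictors,
    computed from the criterion-predictor correlations (r_y1, r_y2) and
    the correlation between the predictors (r_12)."""
    denom = 1.0 - r_12 ** 2
    if denom <= 0:
        raise ValueError("predictors must not be perfectly correlated")
    beta1 = (r_y1 - r_y2 * r_12) / denom
    beta2 = (r_y2 - r_y1 * r_12) / denom
    r_squared = beta1 * r_y1 + beta2 * r_y2
    return beta1, beta2, r_squared


# Invented correlations, for illustration only.
r_y1, r_y2, r_12 = 0.50, 0.30, 0.40
beta1, beta2, r_squared = beta_weights_two_predictors(r_y1, r_y2, r_12)

# Stepwise view: variance credited to each predictor depends on order.
x1_entered_first = r_y1 ** 2                 # x1 alone
x2_added_second = r_squared - r_y1 ** 2      # increment for x2 after x1
x2_entered_first = r_y2 ** 2                 # x2 alone
x1_added_second = r_squared - r_y2 ** 2      # increment for x1 after x2
```

Here the same predictor $x_2$ is credited with 9 percent of the variance when entered first but only about 1 percent when entered second, while the beta weights do not depend on any ordering.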


(iii) We can test hypotheses of the kind normally tested only by
analysis of variance and analysis of covariance procedures. In general,
such hypotheses will deal with differences between groups on the
criterion, where the sample units are assigned to groups according to
some univariate or multivariate classification system. It has only
recently been shown that analysis of variance and of covariance may be
seen as special cases of multiple linear regression analysis (2). It
follows then that the more general technique will be more useful,
especially since it appears that multiple regression is easier for the
average researcher to use and apply than are the traditional procedures.
Thus hypotheses dealing with between-group differences can be tested
within the context of the important questions of (i) and (ii) using
regression analysis. Furthermore, both continuous and discontinuous
predictor variables can be handled in the same analysis. And, in the
process, the scalability of "doubtful" variables can be easily
determined using the hypothesis testing function of regression analysis.
This may be a most important payoff, given the uncritical acceptance of
the assumptions involved in ordinal measurement, and hence the large
number of educational research variables whose scalability is doubtful.
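The claim in (iii) that a one-way analysis of variance is a special case of regression can be checked numerically. The sketch below rests on a known property of least squares: when the only predictors are dummy variables coding group membership, the fitted value for each observation is its group mean. The scores and function names are invented for illustration.

```python
def one_way_anova_f(groups):
    """Classical F ratio: between-groups mean square over within-groups."""
    values = [v for g in groups for v in g]
    n, k = len(values), len(groups)
    grand_mean = sum(values) / n
    ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2
                     for g in groups)
    ss_within = sum((v - sum(g) / len(g)) ** 2 for g in groups for v in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))


def dummy_regression_f(groups):
    """R squared and F for a regression of y on k - 1 group dummies.
    The least-squares prediction for each case is its group mean."""
    values = [v for g in groups for v in g]
    n, k = len(values), len(groups)
    grand_mean = sum(values) / n
    predictions = [sum(g) / len(g) for g in groups for _ in g]
    ss_total = sum((v - grand_mean) ** 2 for v in values)
    ss_residual = sum((v - p) ** 2 for v, p in zip(values, predictions))
    r_squared = 1.0 - ss_residual / ss_total
    f_ratio = (r_squared / (k - 1)) / ((1.0 - r_squared) / (n - k))
    return r_squared, f_ratio


# Invented criterion scores for three groups (illustration only).
scores = [[3, 4, 5], [6, 7, 8], [10, 11, 12]]
f_anova = one_way_anova_f(scores)
r_squared, f_regression = dummy_regression_f(scores)
```

The two F ratios agree, so the between-group hypothesis can be stated either way; the regression form additionally yields $R^2$, the answer to question (i).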


All three types of question are important, and relatively easy to
answer using multiple linear regression analysis. And because it can be
used to assess between-group differences on the criterion (whether
univariate or multivariate classification determines the groups, and
whether or not covariates are considered), regression analysis appears
as the appropriate analysis tool both for ex post facto research and for
research using the less typical experimental or quasi-experimental
designs. The rationale behind the technique is easily understood, and
the technique permits ready communication with the computer. Its use
will reduce reliance, in both design and analysis, on traditional, often
over-rigid statistical formulas and procedures. Further, as has been
suggested, an orientation toward regression analysis as the mainstream
tool will assist in transforming most bivariate studies into
multivariate ones. Not that this orientation itself will indicate which
other variables to consider. That should be dictated by problem
analysis, and by previous research and knowledge, integrated by the
researcher into a unified, consistent, and clear theoretical framework.


One further point should be mentioned in this general discussion of the
design of correlation studies. It is important to know the probable
amount of error variance in all variables the study measures, but
particularly in the criterion variable. Error variance is, by
definition, spurious, random, or untrue variance; that is, variance
which cannot be accounted for. It is a basic research problem, to
repeat, to account for as much of the variance as possible in the
particular criterion variable of interest, using its relationships with
predictors. If criterion error variance is high, there is clearly
something less than 100 percent of its variance to be accounted for.
Error variance is estimated by the use of reliability coefficients,
however computed. This, then, is one reason why reliabilities for all
measurements used in research should be estimated and reported,
especially for those measurements which represent the operational
version of the criterion variable of interest.
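The ceiling that unreliability places on explained variance can be made concrete with two standard results from classical test theory: the share of error variance is one minus the reliability, and the observed correlation between two fallible measures cannot exceed the square root of the product of their reliabilities. The reliabilities below are invented and the function names are hypothetical.

```python
def error_variance_share(reliability):
    """Proportion of observed variance that is error variance."""
    return 1.0 - reliability


def max_observable_correlation(reliability_x, reliability_y):
    """Upper bound on the observed correlation between two measures,
    given their reliabilities (classical test theory)."""
    return (reliability_x * reliability_y) ** 0.5


# Invented reliabilities, for illustration only.
criterion_reliability = 0.80
predictor_reliability = 0.90

error_share = error_variance_share(criterion_reliability)
r_ceiling = max_observable_correlation(predictor_reliability,
                                       criterion_reliability)
variance_ceiling = r_ceiling ** 2
```

With a criterion reliability of .80, one fifth of the criterion variance is error, and a single predictor with reliability .90 could share at most 72 percent of the observed criterion variance.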
FOOTNOTES


1. The term "theory" only means here a set of logically consistent and
plausible propositions which serve to explain, in an if-then sense, the
hypothesis. With the if-then setup, it is clear that theories are never
proved this way, because of the logical fallacy of "affirming the
consequent."

2. The rationale is simple: F is used to test the difference between the
$R^2$ from the model with the predictor, and that from the model without
the predictor, with other predictors in both models.
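The F tests mentioned in (i) and in footnote 2 can be written out with the usual textbook formulas. The sketch below is an illustration, not the article's own computation: the $R^2$ values, sample size, and predictor counts are invented, and the function names are hypothetical.

```python
def f_for_r_squared(r_squared, n, k):
    """F ratio for testing R squared from a model with k predictors
    fitted to n cases; degrees of freedom are k and n - k - 1."""
    return (r_squared / k) / ((1.0 - r_squared) / (n - k - 1))


def f_for_increment(r2_full, r2_reduced, n, k_full, k_reduced):
    """F ratio for the gain in R squared when predictors are added;
    degrees of freedom are k_full - k_reduced and n - k_full - 1."""
    numerator = (r2_full - r2_reduced) / (k_full - k_reduced)
    denominator = (1.0 - r2_full) / (n - k_full - 1)
    return numerator / denominator


# Invented figures: 100 cases, R squared rises from .25 (two predictors)
# to .30 when a third predictor is added.
f_overall = f_for_r_squared(0.30, n=100, k=3)
f_added = f_for_increment(0.30, 0.25, n=100, k_full=3, k_reduced=2)
```

The resulting F for the added predictor would be referred to the F distribution with 1 and 96 degrees of freedom.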
SOCIAL CLASS, OCCUPATIONAL ASPIRATION,
AND OTHER VARIABLES¹


ABSTRACT


Measures of occupational aspiration, perception of occupational
prestige, achievement motivation, and fear of failure of 179 high school
boys were investigated with Ss' socioeconomic and grade levels as
independent variables. ANOVA on a 3x4 factorial design and multiple
comparisons showed that Ss from the lower and lower-lower socioeconomic
groups had significantly lower occupational aspiration and more
distorted perception of occupational prestige hierarchy than Ss from the
middle class and that among the ninth grade Ss, the middle-class group
possessed significantly higher achievement motivation than the lower and
lower-lower groups, whereas among Ss from the lower-lower class, the
twelfth grade group showed significantly higher achievement motivation
than the ninth and tenth grade groups. Analyses of covariance were also
carried out on a 3x4 factorial design with regard to occupational
aspiration and perception of occupational prestige, and the four
dependent variables were factor analyzed and results discussed.


THAT CERTAIN personality variables relate to socioeconomic status has
long been recognized by social scientists. In earlier work by the
National Opinion Research Center (7), it was established that perception
of the occupational prestige hierarchy was positively related to
socioeconomic class. Beilin (3) reported that there was an availability
of high level talent in the lower socioeconomic groups but problems
existed in developing this talent because many lower-class individuals
did not attempt to get the necessary education or training needed for
high level jobs. Tseng and Thompson found that students of lower class
tended to select lower level occupations (9), and that significantly
fewer students from the lower class sought counseling (8).


The primary purpose of the current study was to investigate whether male
high school students of various socioeconomic groups differ
significantly in occupational aspiration, perception of the occupational
prestige hierarchy, achievement motivation, and fear of failure.
Achievement motivation (need achievement) is the degree of
competitiveness for excellence present in a given individual, which is
viewed as the motive to approach success, whereas fear of failure, or
the anxiety level aroused by success-failure cues, is considered as the
motive to avoid failure.


Specific questions examined in the study were as follows:


1. What are the similarities and differences among the middle, lower,
and lower-lower socioeconomic groups with regard to occupational
aspiration, perception of occupational prestige, achievement motivation,
and fear of failure?


2. What are the similarities and differences among the ninth, tenth,
eleventh, and twelfth grade male students in terms of the four
variables?


3. Is there a significant interaction between the socioeconomic status
and grade level in relation to the four dependent measures?


SUBJECTS


The population, consisting of 5,600 persons, was defined as all ninth,
tenth, eleventh, and twelfth grade male students and male drop-outs who
dropped out during the school year 1966-67 and were permanent residents
of McDowell County, West Virginia.


Out of a sample drawn at random, 179 boys who were non-drop-outs and
provided all the data, including the grade level and social class data
as well as the four dependent measures, were selected as the Ss of this
study.


Of this group of 179, there were twenty-nine, fifty-four, forty-seven,
and forty-nine students enrolled in the ninth, tenth, eleventh, and
twelfth grade, respectively. Among these subjects fifty-eight were in
the middle class, sixty-four were in the lower class, and fifty-seven
were in the lower-lower class.


The classification of socioeconomic level was made on the basis of
father's occupation and father's and mother's educational level.
National Opinion Research Center (NORC) scores of fathers' occupations
ranging from 1 to 49 were classified as being in the middle class, from
50 to 76 as belonging to the lower class, and from 77 to 90 as belonging
to the lower-lower class. The NORC score was obtained by assigning the
ranking of the occupation in terms of its prestige level as classified
by the National Opinion Research Center (7) with, for example, 2
representing physician, 10 representing banker, 60 representing plumber,
and so forth. The smaller the value of the NORC score, the higher the
level of prestige of the occupation. The cutoff points for father's and
mother's educational level were as follows: high school graduate and
above, middle class; from ninth to eleventh grades, lower class; and
grades 8 and below, lower-lower class. Meeting at least two out of the
three criteria mentioned above was necessary for a S to be classified as
being in a given social class.
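The two-out-of-three classification rule can be sketched as follows. Several details are assumptions made for illustration rather than facts from the study: educational level is coded as the highest grade completed (12 or more meaning high school graduate and above), a S whose three criteria all disagree is returned as unclassified, and all function names are hypothetical.

```python
from collections import Counter


def class_from_norc(score):
    """Social class implied by the NORC rank of the father's occupation."""
    if 1 <= score <= 49:
        return "middle"
    if 50 <= score <= 76:
        return "lower"
    if 77 <= score <= 90:
        return "lower-lower"
    raise ValueError("NORC score must be between 1 and 90")


def class_from_education(highest_grade):
    """Social class implied by a parent's educational level, coded here
    as highest grade completed (an assumption of this sketch)."""
    if highest_grade >= 12:
        return "middle"
    if highest_grade >= 9:
        return "lower"
    return "lower-lower"


def classify_subject(norc_score, father_grade, mother_grade):
    """Return the class indicated by at least two of the three criteria,
    or None when no two criteria agree."""
    votes = Counter([
        class_from_norc(norc_score),
        class_from_education(father_grade),
        class_from_education(mother_grade),
    ])
    label, count = votes.most_common(1)[0]
    return label if count >= 2 else None


# Invented subjects, for illustration only.
example_middle = classify_subject(10, 12, 10)       # banker, HS graduate
example_lower_lower = classify_subject(60, 8, 7)    # plumber, grade 8 and 7
example_unclassified = classify_subject(10, 10, 6)  # all three disagree
```

How the original study handled Ss for whom no two criteria agreed is not stated; returning None simply makes that case visible.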


INSTRUMENTS 


The instruments used in this investigation were a questionnaire which
collected data concerning the S's age, race, grade level, father's and
mother's educational levels, and father's occupation; Haller's
Occupational Aspiration Scale (OAS); the NORC Occupational Prestige
Scale (OPS); McClelland's (6) Need Achievement Thematic Apperception
Test (TAT); and the Mandler-Cowan Test Anxiety Questionnaire (TAQ) for
High School Students.


The OAS (4) is an 8-item multiple-choice instrument designed primarily
for use among male high school students. The total score is interpreted
as a relative indicator of the prestige level of the occupational
hierarchy which an individual views as a

e ili i i orte e 

About Қы reliability of this scale is rep 

The OPS consisted of twenty occupations which were selected from the
list of ninety used in the NORC study (7). Subjects were instructed to
rank these twenty occupations on the basis of their opinion of each
occupation's prestige. Scoring was done by subtracting the ideal rank
provided by the scale from the rank given by the S for each occupation.
The total score was then obtained by adding the discrepancy scores for
all the twenty occupations. This represented a deviation of the S's
perception of occupational prestige in relation to the social norm.


The TAT consisted of four pi 

. pictures (1 

in 4 neutral classroom situation. е ее 
у two trained graduate students with i - 

M ith inter-rater re- 


A short form of TAQ (5) consisted of thi 

th - 
items. It correlated .946 with the 48-item (mea 
Each item was graded on a 9-point scale with 1 rep- 
resenting low anxiety level and 9 representing high 


anxiety level. 
RESULTS 


With social class (middle, lower, and lower-lower) and grade level (9,
10, 11, and 12) as two factors, analyses of variance on a 3x4 factorial
design were carried out with regard to the four dependent measures:
occupational aspiration, perception of occupational prestige,
achievement motivation, and fear of failure. Results of these analyses
are shown in Table 1.


Social class as an independent variable was found to be the only
significant main effect (p < .001) for both occupational aspiration and
perception of occupational prestige. A significant interaction effect
(p < .05) was found on achievement motivation, but no significant main
effects or interaction were found on fear of failure.


The significant interaction found on achievement motivation indicated
that a given social class had different effects for one grade level of
Ss from what it had for other grade levels and that a given grade level
had different effects for one social class of Ss from what it had for
other social classes. In order to test the simple effects of the social
classes for each of the grade levels and those of the grade levels for
each of the social classes, variance analyses were carried out.³ Table 2
summarizes the results.


It was found that there were significant differences between social
classes for grade 9 and that there were significant differences between
grade levels for the lower-lower class, as far as achievement motivation
was concerned.

In order to examine the mean differences of the three socioeconomic
groups (M, L, LL) on occupational aspiration and perception of
occupational prestige, as well as the mean differences of the three
social classes of the ninth grade subjects (9M, 9L, 9LL) and the four
grade levels of the lower-lower class Ss (9LL, 10LL, 11LL, 12LL) on
achievement motivation, Duncan's New Multiple Range Test was used.
Results of these multiple comparisons are given in Table 3.

The mean occupational aspiration score of the M group was significantly
higher (p < .05) than those of the L and LL groups. There was no
significant mean difference between the L and LL groups on occupational
aspiration. Ss from the lower socioeconomic classes showed significantly
lower occupational aspiration than those from the middle class.

The mean score of the perception of occupational prestige of group M was
significantly (p < .05) lower
TABLE 1


ANALYSES OF VARIANCE OF FOUR DEPENDENT VARIABLES


Variable and Source               df          MS        F

Occupational aspiration (OAS)
  Social class                     2     2029.21    16.62*
  Grade level                      3      214.18     1.75
  S x G                            6      191.82     1.57
  Residual                       167      122.10

Occupational prestige (OPS)
  Social class                     2    12006.65     9.11*
  Grade level                      3      889.55     0.68
  S x G                            6     1278.11     0.97
  Residual                       167     1318.66

Achievement motivation (TAT)
  Social class                     2       36.99     1.12
  Grade level                      3       59.71     1.81
  S x G                            6       70.72     2.15**
  Residual                       166       32.96

Fear of failure (TAQ)
  Social class                     2     2760.23     2.54
  Grade level                      3      235.89     0.22
  S x G                            6      808.19     0.74
  Residual                       164     1089.05

*p < .001
**p < .05
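Each F in Table 1 is the mean square for the effect divided by the residual mean square of the same analysis. As a small consistency check (a sketch only, with a hypothetical data layout and a tolerance of .01 chosen to allow for rounding in the printed table), the ratios can be recomputed from the tabled mean squares:

```python
# Mean squares and reported F ratios transcribed from Table 1.
TABLE_1 = {
    "Occupational aspiration (OAS)": {
        "residual_ms": 122.10,
        "effects": {"Social class": (2029.21, 16.62),
                    "Grade level": (214.18, 1.75),
                    "S x G": (191.82, 1.57)},
    },
    "Occupational prestige (OPS)": {
        "residual_ms": 1318.66,
        "effects": {"Social class": (12006.65, 9.11),
                    "Grade level": (889.55, 0.68),
                    "S x G": (1278.11, 0.97)},
    },
    "Achievement motivation (TAT)": {
        "residual_ms": 32.96,
        "effects": {"Social class": (36.99, 1.12),
                    "Grade level": (59.71, 1.81),
                    "S x G": (70.72, 2.15)},
    },
    "Fear of failure (TAQ)": {
        "residual_ms": 1089.05,
        "effects": {"Social class": (2760.23, 2.54),
                    "Grade level": (235.89, 0.22),
                    "S x G": (808.19, 0.74)},
    },
}


def recompute_f_ratios(table):
    """Return {(variable, source): (recomputed F, reported F)}."""
    results = {}
    for variable, analysis in table.items():
        for source, (mean_square, reported_f) in analysis["effects"].items():
            recomputed = mean_square / analysis["residual_ms"]
            results[(variable, source)] = (recomputed, reported_f)
    return results


f_ratios = recompute_f_ratios(TABLE_1)
largest_gap = max(abs(calc - rep) for calc, rep in f_ratios.values())
```

Every recomputed ratio falls within .01 of the printed value, which suggests the table was transcribed correctly.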


than those of groups L and LL, which were found to be homogeneous. It
appeared that as the social class of the group shifted away from the
middle class toward the lower socioeconomic levels, the Ss' perception
of occupational prestige hierarchy became significantly more distorted
from a national norm (7) standpoint.


Significant mean differences (p < .05) found in achievement motivation
are as follows. Among the ninth grade Ss, 9M possessed significantly
higher achievement motivation than the 9L and 9LL groups.


TABLE 2


ANALYSES OF VARIANCE OF ACHIEVEMENT MOTIVATION FOR SIMPLE EFFECTS


Source                            df        MS       F

Social class for grade 9           2    151.50    4.59*
Social class for grade 10          2     36.50    1.11
Social class for grade 11          2     14.50    0.44
Social class for grade 12          2     34.00    1.03
Residual                         166     32.96

Grade for middle class             3     21.67    0.66
Grade for lower class              3     74.67    2.26
Grade for lower-lower class        3
Residual                         166     32.96


TABLE 3


MULTIPLE COMPARISONS AMONG MEANS WITH α = .05


Variable                   Mean Comparison*

                           L         LL        M
Occupational aspiration    37.1      39.0      48.0
                           (n=64)    (n=57)    (n=58)

                           M         L         LL
Occupational prestige      66.7      90.8      92.0
                           (n=58)    (n=64)    (n=57)

                           9LL       9L        9M
Achievement motivation     5.7       7.4       13.1
                           (n=6)     (n=9)     (n=14)

                           9LL       10LL      11LL      12LL
Achievement motivation     5.7       8.8       10.1      13.1
                           (n=6)     (n=25)    (n=9)     (n=17)

*Significantly different at the .05 level.


Whereas, among the Ss from the LL social class, the twelfth grade Ss
showed significantly higher achievement motivation than the 9LL and 10LL
Ss.


In order to determine the extent to which each of the four dependent
variables might be related to the others, correlational analyses were
carried out. The resultant product-moment correlation coefficients are
shown in Table 4.


The only nonsignificant correlation coefficient found was that (-.10)
between the scores of achievement motivation (TAT) and fear of failure
(TAQ).
TABLE 4


INTERCORRELATIONS (N=176)


                           Occupational   Occupational
Variable                   aspiration     prestige        TAT      TAQ

Occupational aspiration        -
Occupational prestige        -.50***          -
TAT                           .22*          -.26**         -
TAQ                          -.26**          .25**       -.10       -

*p < .05
**p < .01
***p < .001


TSENG 


TABLE 5


ANALYSES OF COVARIANCE


Source                            df          MS       F

Occupational aspiration (criterion)
Occupational prestige (covariate)
  Social class                     2      950.23    8.88*
  Grade level                      3      320.38    2.39
  S x G                            6      124.13    1.16
  Regression                       1     3801.78
  Residual                       166      107.02

Occupational prestige (criterion)
Occupational aspiration (covariate)
  Social class                     2     2306.31    2.11
  Grade level                      3     1757.75    1.61
  S x G                            6      930.10    0.85
  Regression                       1    38819.77
  Residual                       166     1092.75

*p < .001


Occupational aspiration and distortion of perception of occupational
prestige hierarchy were found to have a rather high and negative
correlation (r = -.50, p < .001). Since occupational aspiration and
perception of occupational prestige seemed to covary, analyses of
covariance on a 3x4 factorial design were conducted with regard to each
of the two variables using the other as the covariate. Table 5 presents
the results.
When occupational aspiration was adjusted in terms of the differences in
the Ss' perception of occupational prestige hierarchy, exactly the same
findings as revealed by the earlier variance analysis resulted. In other
words, social class was the only significant main effect, and Ss from
the lower and lower-lower socioeconomic groups showed significantly
lower occupational aspiration than Ss from the middle class, with
differences in their perception of occupational prestige being
controlled.


When the Ss' perception of occupational prestige was adjusted in
accordance with the differences in their occupational aspiration, none
of the main effects and interaction was found to be statistically
significant.


SUMMARY AND DISCUSSION

The sample consisted of 179 male students of grades 9 through 12 from a
geographically and culturally deprived area which is located in what is
known as the Appalachian region. The study has attempted to answer
questions concerning the relationship of socioeconomic levels and grade
levels to the variables occupational aspiration, perception of
occupational prestige, achievement motivation, and fear of failure.


Variance analyses, covariance analyses, and multiple comparisons
revealed that socioeconomic groups differed significantly on
occupational aspiration, with or without their differences in perception
of occupational prestige being controlled, that socioeconomic groups
differed significantly on perception of occupational prestige hierarchy
without controlling for their differences in occupational aspiration,
that socioeconomic levels of the ninth grade group differed
significantly on achievement motivation, that grade levels of the
lower-lower socioeconomic group differed significantly on achievement
motivation, and that there were no significant differences between
socioeconomic groups or between grade levels on fear of failure.


It appears that the individual's socioeconomic status does have a great
deal to do with occupational aspiration, perception of occupational
prestige, and achievement motivation. In order to stimulate social
mobility in the lower socioeconomic groups, therefore, it would be
necessary to help them raise the goals they set for the attainment of
higher level occupations, change their perception of the world in
general and of the world of work and education in particular, and
acquire a stronger urge to approach success.
Of empirical interest was the degree to which the four dependent
variables investigated in the study (occupational aspiration, perception
of occupational prestige, achievement motivation, and fear of failure)
might be tapping the same factors. A factor analytic approach was thus
used to clarify this point. The principal-component solution was used to
generate a factor matrix from the 4x4 correlation matrix. The extracted
factors I, II, III, and IV contributed 46, 22, 20, and 12 percent of the
total variance, respectively. These factors were then orthogonally
rotated to optimize the contribution of each of the four variables to
each of the four factors. Results are shown in Table 6.


Factor I is characterized by a high factor loading of .96. This high
correlation between Factor I and occupational aspiration, together with
other insignificant factor loadings of -.25, .09, and -.11, would
indicate that this is the occupational aspiration factor. Factor II,
represented by a high factor loading of .99, clearly is the achievement
motivation factor. It can be observed from Table 6 that, in fact, Factor
III is the fear of failure factor and Factor IV is the perception of
occupational prestige factor. In conclusion, the four dependent
variables tapped by this study in relation to socioeconomic and grade
levels of adolescent boys appear to be uniquely different variables.

TABLE 6


ROTATED FACTOR MATRIX


                           Factor    Factor    Factor    Factor
Variable                   I         II        III       IV

Occupational aspiration    .96       -10       -.12      -.25
Occupational prestige      -.25      -42       12        .95
Achievement motivation     .09       .99       -.0
Fear of failure            -.11      -.04      .99
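The share of total variance attributed to the first principal component can be checked against the intercorrelations in Table 4, since that share is the largest eigenvalue of the correlation matrix divided by the number of variables. The sketch below uses plain power iteration; it assumes the factor analysis was run on the Table 4 coefficients (the article does not say so explicitly), and the function name is hypothetical.

```python
# Correlation matrix from Table 4, in the order:
# occupational aspiration, occupational prestige, TAT, TAQ.
R = [
    [1.00, -0.50, 0.22, -0.26],
    [-0.50, 1.00, -0.26, 0.25],
    [0.22, -0.26, 1.00, -0.10],
    [-0.26, 0.25, -0.10, 1.00],
]


def largest_eigenvalue(matrix, iterations=500):
    """Largest eigenvalue of a symmetric positive definite matrix,
    found by power iteration and a final Rayleigh quotient."""
    n = len(matrix)
    vector = [1.0] + [0.0] * (n - 1)
    for _ in range(iterations):
        product = [sum(row[j] * vector[j] for j in range(n))
                   for row in matrix]
        norm = sum(value * value for value in product) ** 0.5
        vector = [value / norm for value in product]
    product = [sum(row[j] * vector[j] for j in range(n)) for row in matrix]
    return sum(vector[i] * product[i] for i in range(n))


first_eigenvalue = largest_eigenvalue(R)
first_component_share = first_eigenvalue / len(R)
```

Under that assumption the first component accounts for roughly 46 percent of the total variance, in line with the figure reported for Factor I before rotation.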
FOOTNOTES


1. This study was supported in part by the Office of Economic
Opportunity as a part of the McDowell County Evaluation Project,
Contract Number OEO-703 between West Virginia University and the Office
of Economic Opportunity.


2. The author would like to express his appreciation to Donald L.
Thompson for his assistance.


3. A method concerning analysis of variance for simple effects of a
statistically significant interaction can be found, for example, in B.
J. Winer's Statistical Principles in Experimental Design, McGraw-Hill,
New York, pp. 233-238.
ABSTRACT


Some groups of preliminary year indigenous students at the University of
Papua and New Guinea were Ss for an investigation of concept selection
strategies among students from a non-Western culture. Some of the
methods of Bruner, Goodnow, and Austin (1) were replicated. It was found
that in forming conjunctive categories the students were consistent in
maintaining a definite strategy, that more students adopted a scanning
strategy, and that focusers were the most successful. Mixed strategists
attained no success. The abstraction of disjunctive concepts provided
more difficulties, as it did with their Western counterparts. The Ss,
especially focusers, did not maintain their strategies. Focusers and
scanners on the conjunctive problems did not interchange roles on the
disjunctive problems, but some of each adopted mixed strategies with the
latter problems. Mixed strategists and focusers were more successful
than scanners, though focusers either solved all or no problems. The Ss
generally found the use of negative examples difficult. The
investigation is the first part of a series of studies.


THE RESULTS described in this paper are the initial ones from the first
part of a longitudinal study. It is felt that the results so far
obtained are of sufficient interest to warrant discussion now. The
objectives of these experiments are to investigate the concept selection
strategies of indigenous mathematics students at the University of Papua
and New Guinea, to describe these strategies mathematically, to link
this mathematical formulation with a category model of abstraction, and
to contrast the results of Australian students with Papuan and New
Guinean students. The experiments were also carried out to provide
background information for a mathematics project.


Bruner, Goodnow, and Austin (1) made reference to the difficulty that
the Harvard students encountered with disjunctive categories. They
suggested that this may be a legacy of their Western culture with its
"common effects have common causes" style of thinking, as well as a
general inability to use negative information. In this paper, some
initial findings about strategies employed by Papuan and New Guinean
students are presented as a source of some information about a
less-Westernized culture.

RELATED RESEARCH 


Bruner and others have carried out the classic study of idealized
concept selection strategies. Their Ss had to form conjunctive,
disjunctive, and relational categories when presented with positive and
negative examples. In their terms, attaining a concept and learning a
category were identical. The two ideal strategies they describe were
called focusing and scanning.

The pioneering study in the comparison of perfor- 
mances in a conceptual task with some ideal strategy 
was that of Whitfield (9). He was followed by Hov- 
land and Weiss (6), and like Bruner and others, found 
that negative information was not used as efficiently 
as positive information in concept attainment. 


A criticism by Henle (5) of Bruner and others was that they tended to
under-emphasize the role of logic in their investigations of reasoning
processes. Previously, Donaldson (3) had also criticized their failure
to discuss perceptional links in the use of negative information. Bruner
and others suggested that to avoid mistakes students distrust
transformation of negative to positive information.


Donaldson’s research involved repeated transfor- 
mations in both directions. His Ss found the sequence 
negative to positive and back especially difficult. The 
difficulty was in using the positive derived from the 
negative. 
His explanation for this was twofold: the air of finality of positive
statements and a certain mistrust of negative information. This latter
point was accounted for by Donaldson not in terms of the risk of failure
so much as a less rational feeling "that negative information is not
such good currency as positive information."


Campbell (2) developed the work of Bruner and 
others and Donaldson and supported another sugges- 
tion of the former, that lack of facility with indirect 
procedures may extend to a variety of cognitive ac- 
tivities other than categorization. 


All of these studies have tended to confuse two variables: the learning
of rules when one knows the relevant properties in a concept, and the
learning of the relevance of the properties in a concept when one knows
the rules. Haygood and Bourne (4) showed that there was a difference
between these two variables. Wallace (8) summarizes Bruner's work and
mentions two other studies that support it.


THE PRESENT STUDY 


Attempts were made in this study to replicate cer- 
tain aspects of Bruner's work at Harvard. The two 
types of category considered here were the conjunc- 
tive and disjunctive. Eight examples of each concept 
to be attained were presented to Ss. These examples 
were either positive (some aspect of the concept) or 
negative (no aspect of the concept). Each example 
confirmed or contradicted the S's hypothesis. 


This hypothesis was formed by considering all or 
some of the attributes of the first positive example. 
Those who considered all the attributes were using 
a focusing strategy. Those who considered only one
attribute at a time were using a scanning strategy.
The others were adopting a mixed strategy. The two
ideal strategies in forming conjunctive categories
are schematically described in Figure 1. In the example
in Figure 1, if the concept to be attained were
conjunctive it would be “things containing c and d”;
if it were disjunctive it would be “things containing
c, or things containing d, or things containing c and
d.”


In abstracting the conjunctive concept, the S takes 
the common attributes between successive positive 
examples. Each negative example can be replaced 
by a complementary positive example, of which the 


FIGURE 1


IDEAL STRATEGIES FOR CONJUNCTIVE CATEGORIES


[Diagram of the focusing and scanning strategies not reproduced.]

a, b, c, d are attributes of the first positive example.
Positive examples confirm or contradict a, b, c, d.


FIGURE 2


DESCRIPTION OF CONJUNCTIVE AND DISJUNCTIVE CATEGORIZATION


A, B, C, D are positive examples;
E, F, G, H are negative examples.


E implies the existence of a complementary positive
cE, consisting of those attributes which are not in E.


Conjunctive category is formed from:
(A or B or C or D) and (cE or cF or cG or cH).


Disjunctive category is formed from:
(A and B and C and D) or c (E and F and G and H).


attributes are those that are not in the negative example.
These are used as described in Figure 2.
Figure 2 also shows the formation of the disjunctive
category where use is made of the complementary
set formed from all the attributes in the negative examples.
The descriptions in Figure 2 are based on
a set theoretic analysis of categorization and are not
unique. For instance, c (E and G and H and I) can
be replaced by (cE or cG or cH or cI).
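
The equivalence just stated is De Morgan's law. A minimal sketch follows, assuming that each example is modelled as a set of attributes, that "and" is read as intersection and "or" as union, and that the universe and the four negative examples below are hypothetical values chosen only for illustration.

```python
# Hypothetical universe of attributes; not taken from the study's cards.
UNIVERSE = frozenset("abcdefgh")


def complement(attrs):
    """Attributes of the universe that are NOT in attrs (the text's cE)."""
    return UNIVERSE - attrs


# Hypothetical negative examples, each a set of attributes.
E = frozenset("abc")
G = frozenset("bcd")
H = frozenset("bce")
I = frozenset("bcf")

# c (E and G and H and I): complement of the attributes shared by all four.
lhs = complement(E & G & H & I)

# (cE or cG or cH or cI): union of the four individual complements.
rhs = complement(E) | complement(G) | complement(H) | complement(I)

print(sorted(lhs), lhs == rhs)
```

Under these assumptions both expressions select the same attributes, which is why the set-theoretic descriptions in Figure 2 are not unique.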


Trial runs were made with different forms of the
experiment on some Australian undergraduate mathematics
students and then on some Papuan and New
Guinean meteorology students. In the form that was
adopted, the testing was done in a group situation with
the cards of Bruner and others projected onto a
screen. Each card was projected for 20 seconds
whereas Bruner and others had displayed them for
10 seconds. It was felt that some allowance had to
be made for the fact that English was the second language
for many of the Ss.


These cards could have one, two, or three borders,
with one, two, or three shapes which could be
circles, crosses, or squares and in one of three
colors. There were eighty-one cards which were
tested in successive weeks.
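
The count of eighty-one follows from four attributes with three values each (3 x 3 x 3 x 3). The sketch below enumerates such a deck and counts the cards that fall under one conjunctive and one disjunctive concept. The color labels and the particular concepts are assumptions made for illustration; the text does not name the three colors.

```python
from itertools import product

BORDERS = (1, 2, 3)                          # number of borders
COUNTS = (1, 2, 3)                           # number of shapes on the card
SHAPES = ("circle", "cross", "square")
COLORS = ("color 1", "color 2", "color 3")   # placeholder names (assumption)

# Every combination of the four attributes is one card.
deck = list(product(BORDERS, COUNTS, SHAPES, COLORS))


def is_conjunctive_member(card, shape, color):
    """Card belongs to the concept 'shape AND color'."""
    return card[2] == shape and card[3] == color


def is_disjunctive_member(card, shape, color):
    """Card belongs to the concept 'shape OR color (or both)'."""
    return card[2] == shape or card[3] == color


n_conj = sum(is_conjunctive_member(c, "cross", "color 1") for c in deck)
n_disj = sum(is_disjunctive_member(c, "cross", "color 1") for c in deck)
print(len(deck), n_conj, n_disj)
```

The disjunctive concept admits many more positive cards than the conjunctive one, so a negative example carries proportionally more information in the disjunctive case.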


The students’ responses were recorded on specially
prepared record sheets. The group situation
required the use of several monitors to assist in the
administration of the test. Very few difficulties were
encountered that had not been foreseen as a result
of the trial runs.


THE POPULATION


The University of Papua and New Guinea has a
preliminary year which is a year between the completion
of the Papua and New Guinea School Certificate
and the commencement of undergraduate studies.
It was felt that the preliminary year students would
be the least Westernized.


The preliminary year classes had been made up
by the University administration and three were
chosen at random for testing. There is no reason
to believe that they were not typical preliminary


TABLE 1

STRATEGIES FOR CONJUNCTIVE CATEGORIES

Strategy              Scanning    Focusing    Mixed

Number (%)            39 (60%)    22 (35%)    4 (5%)
  X² = 17.9, p < 0.001
4/4 correct           8 (21%)     16 (73%)    0 (0%)
Secondary schooling:
  Administration      20 (31%)    10 (15%)    1 (3%)
  Mission             19 (29%)    11 (17%)    3 (5%)
Mean age              18 years    17 years    18 years
                      8 months    11 months   6 months


TABLE 3

STRATEGIES FOR DISJUNCTIVE CATEGORIES

Strategy              Scanning    Focusing    Mixed

Number (%)            25 (38%)    10 (16%)    30 (46%)
  X² = 9.9, p < 0.01
4/4 correct           0 (0%)      5 (50%)     5 (17%)



year students, especially because Kearney (7) points 
out that they are more homogeneous than their West- 
ern counterparts. 


THE RESULTS 


Table 1 shows the results for the conjunctive concepts.
The first feature to be noticed is the high percentage
(95%) of students who adopt one strategy or
the other consistently in forming the conjunctive category.
More go in for scanning than focusing, in contrast
to the Harvard group. The next point that
emerges is the greater success of the focusers in
solving the problems. They are also younger on the
average and there are no focusers over the age of
19, though it is not possible to infer anything from
this evidence. If we try to determine whether the
expected frequencies are equal for all categories we
have to accept H₁ (the frequencies are not equal)
and reject H₀ (the frequencies are equal and X² is
distributed as chi-square with 2 d.f., α = .05).
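
The decision rule can be checked with a few lines of arithmetic. The sketch below assumes only the statistic reported for the conjunctive problems (X² = 17.9) and 2 d.f.; it uses the fact that for exactly 2 d.f. the chi-square upper-tail probability has the closed form exp(-x/2). The function names are illustrative and are not taken from the study.

```python
import math


def chi2_upper_tail_2df(x):
    """P(X2 >= x) for a chi-square variate with 2 d.f. (exact for 2 d.f. only)."""
    return math.exp(-x / 2.0)


def chi2_critical_2df(alpha):
    """Value exceeded with probability alpha under 2 d.f."""
    return -2.0 * math.log(alpha)


reported_statistic = 17.9              # conjunctive problems, as reported
critical = chi2_critical_2df(0.05)     # roughly 5.99
p_value = chi2_upper_tail_2df(reported_statistic)

reject_h0 = reported_statistic > critical
print(round(critical, 2), reject_h0, p_value < 0.001)
```

The reported statistic lies well beyond the .05 critical value, and its tail probability is below .001, in line with the significance level given for the conjunctive problems.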


Some attempt was made to see if there was any 
relation between the Strategy employed and the dis- 
trict in which the student lived as a child but no 


TABLE 2


CONJUNCTIVE STRATEGIES COMPARED WITH
DISTRICT OF PRIMARY SCHOOL EDUCATION
AND WITH SUBJECTS' PROPOSED UNDERGRADUATE
COURSES

Strategy                      Scanning   Focusing   Mixed    X²

Papua                             7          8        1      5.8
New Guinea Mainland              10          6        1      6.8
New Guinea Islands               22          8        2     21.2*

Arts-Law-Education               12          8        2      7.3
Science-Dentistry-Medicine       27         14        2     22.4*

*p = 0.001


TABLE 3 (continued)

Strategy              Scanning    Focusing    Mixed
0/4 correct           10 (40%)    5 (50%)     20 (67%)


definite picture emerged. A crude summary is pro- 
vided in Table 2. The main feature of interest in this 
part was that all the East New Britain students were 
scanners. It is intended to investigate whether this 
is related to the fact that in a recent trial of mathe- 
matics learning materials for grade 7 in Papua and 
New Guinea, the only students to encounter serious 
difficulties in set theory and logic were those at 
schools in East New Britain. Little can be learned
at this stage from the rest of Table 2, which compares
the strategies and the proposed undergraduate courses
of the Ss. Chi-square tests for individual classifications
are also displayed in Table 2. Cross-classifications
gave X² = 4.7 (4 d.f.) and X² = 2.7
(2 d.f.) for the districts and courses respectively,
neither of which was significant.


Table 3 shows the results on the disjunctive problems.
Like their Western counterparts, the Ss found
much more difficulty with these. The chi-square
test showed a significant difference.
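
As a rough check on the disjunctive result, the statistic can be recomputed from the strategy counts in Table 3 (25 scanning, 10 focusing, 30 mixed). The sketch assumes that the test compares these counts with equal expected frequencies, as described for the conjunctive problems; the function name is illustrative.

```python
import math


def chi_square_equal_expected(observed):
    """Goodness-of-fit statistic against equal expected frequencies."""
    expected = sum(observed) / len(observed)
    return sum((o - expected) ** 2 / expected for o in observed)


disjunctive_counts = [25, 10, 30]   # scanning, focusing, mixed (Table 3)
statistic = chi_square_equal_expected(disjunctive_counts)

# With three categories there are 2 d.f., for which the upper-tail
# probability is exactly exp(-x / 2).
p_value = math.exp(-statistic / 2.0)
print(round(statistic, 1), p_value < 0.01)
```

Under that assumption the recomputed statistic is about 10, close to the reported 9.9, and the tail probability is below .01.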


Many of the focusers and scanners of the conjunctive
problems adopted mixed strategies when confronted
with the use of the negative examples in the
disjunctive problems. Focusers on the conjunctive 
problems did not become scanners, or vice versa. 
Focusers tended to get all or none of the problems 
correct, and were, with the mixed strategists, more 
successful than the scanners. Conjunctive focusers 
were no more successful as disjunctive mixed strat- 
egists than conjunctive scanners. 


CONCLUDING DISCUSSION 


It should be borne in mind that the Ss were not untouched
by Western culture. Although the materials
of the experiment were virtually culture-free, it
would not be easy to carry out similar testing on villagers
in remote areas.


No explanation can be offered yet in cultural terms
for the remarkably high percentage of adherents to 
a definite strategy on the conjunctive items, or the 
relative success of the focusers, or the preference 
for scanning rather than focusing. In other respects 
the Ss did not seem to differ much from their West- 
ern counterparts, though later testing of the same 
Ss and more refined techniques for testing new Ss 
may enlarge any differences.


ABSTRACT


The development of an instrument for measuring teachers' attitudes toward individualizing reading instruction
is described. The instrument was constructed in the semantic differential format, chosen as an indirect
means of assessing attitudes. Teachers were asked to respond anonymously on adjective scales to eleven examples
of classroom procedures. The examples were designed to be applications of the assumptions of individualized
reading instruction. Reliability of the instrument was high (.93); content validity was demonstrated for
classroom examples and adjectives used in the instrument. Two validation studies which established the effectiveness
of the instrument in discriminating among teachers' attitudes toward individualizing reading instruction
are also reported.


THE IMPORTANCE of individualizing reading
instruction, also known as “diagnostic teaching”
or “individually guided instruction,” has been recently
emphasized. Regardless of terms, the notion
is that instruction should be based on pre-assessment
of children's individual strengths and weaknesses. In
order for this type of instruction to become a classroom
reality, teachers must focus on the needs of
individuals rather than the group. This orientation
may require a basic change in teachers' attitudes as
the need for continuous assessment and modification
of each individual's instructional program based on
the assessment takes precedence over getting the
classroom group through the pages in a basal reader
and accompanying workbook.


One attempt to assess teachers' predisposition
toward various instructional approaches in reading
has come to the author's attention. The San Diego
Teacher Inventory of Approaches to the Teaching of
Reading (8) measures teachers' agreement with the
assumptions of three instructional approaches: basic,
individualized, and language experience. The definition
of the individualized approach, however, is the
classic one advocated by Veatch (9) and others, involving
the principles of seeking, self-selection, and
self-pacing. The San Diego Inventory, consequently,
does not measure teachers' attitudes toward the most
recent concept of individualization (that of planning
each child's reading instructional program based on
his pre-assessed needs), which can occur within any
instructional approach, with any group size, using
any materials.


An instrument was thus constructed to as- 
sess teachers’ attitudes toward individualizing in- 
struction in reading. In this paper the development 
of the instrument and two validation studies are de- 
scribed. 


DEVELOPMENT OF THE INSTRUMENT 


A problem in attitude assessment has been that
respondents tend to give the answers they think are
expected, rather than to respond as they actually believe.
Jackson and Messick (4) cautioned that “indirect,
disguised techniques” are sometimes necessary
to obtain a valid measurement of an attitude.
Weschler and Bernberg (10:225), furthermore,
stated that “to a certain extent the value of a given
technique may depend upon the manner in which it is
able to disguise its true purpose and can be adjusted
to fit into a variety of different situations.”


An indirect method of assessing attitudes, the 
Reading Teacher Survey, was accordingly chosen for 
the form of the attitude inventory. It was an adap- 
tation of the semantic differential. 


Remmers (7), after summarizing several studies
that employed the semantic differential in assessing
attitudes for various purposes, cautioned that a
bias due to response-sets may be operating. In other
words, the order of presentation of the concepts to be
evaluated may influence the responses of the S. More
recently, however, Kane (5), after analyzing data
from a semantic differential instrument which included
various combinations for ordering items, showed
that item order is not a significant influence and that
an experimenter need not worry about proximity errors.


Osgood, Suci, and Tannenbaum have discussed 
the flexibility of the semantic differential: 


Although we often refer to the semantic 
differential as if it were some kind of
“test,” having some definite set of items
and a specific score, this is not the case. 
To the contrary, it is a very general way 
of getting at a certain type of information, 
a highly generalizable technique of mea- 
surement which must be adapted to the re- 
quirement of each research problem to 
which it is applied. There are no standard 
concepts and no standard scales; rather, 
the concepts and scales used in a particu- 
lar study depend upon the purposes of the 
research. (6:76) 


Two adaptations of the basic semantic differen- 
tial instrument, as described by Osgood and others 
(6), were made here. First, analysis of the three 
factors found by Osgood and others—evaluation, po- 
tency, and activity—was not undertaken since mea- 
surement of a unitary concept of attitude toward in- 
dividualizing reading instruction seemed more desir- 
able. Second, an agree-disagree scale was included 
to determine whether Ss would tend to respond more 
positively to it than to the other scales which consis- 
ted of adjectives. This notion was supported by the 
data. 


Pilot Studies 


The form of the Reading Teacher Survey was
changed several times before the final version was
arrived at. A brief summary of changes is given
here; a more detailed description may be found elsewhere
(1).


The first version consisted of twelve statements
which summarized the basic tenets of individualizing
instruction in reading. Subjects were asked to respond
to the statements on seven adjectival scales and
an agree-disagree scale, each of which had seven positions
ranging from the negative extreme to the positive
extreme. The adjectives for the scales were
picked from those used in the literature to describe
individualized reading instruction. This version, however,
was not retained after a pilot test because
teachers all tended to mark the positive extreme, and,
thus, the instrument did not discriminate among individuals.
It appeared that teachers were responding
positively to the theories of individualization without
thinking of the classroom ramifications.


The next version presented twelve examples of
classroom situations, illustrating procedures that
would grow out of the assumptions of individualized
instruction. One of the twelve statements or examples,
representing a viewpoint opposed to individualization,
was inserted to break a set toward positive
responses (see Appendix, statement 4).


Teachers, who did not sign their names, were
asked to consider the feasibility of applying each of
the twelve examples in their classrooms. They were
instructed to record their responses on the eight
rating scales (the same ones used in the previous
version) following each example. This version, after
pilot testing and subsequent minor revisions, was
used in a study of attitude change (1).


Revised Version 


Since the instrument did reflect a change in
teachers' attitudes due to an intervening experimental
treatment, further revisions were made on the
basis of item analyses. Using the Generalized Item
Analysis Program (2), each of the ninety-six items
(i.e., each of eight scales under each of the twelve
examples) was analyzed both in terms of the correlation
with the subtest score (the subtest being the
classroom example) and in terms of the correlation
with the total test score.
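
The kind of item analysis described here can be illustrated with a small calculation. The sketch below is not the Generalized Item Analysis Program; it is a plain Pearson item-total correlation on a hypothetical five-respondent, four-item data set, with the item left in the total (an uncorrected correlation). All names and ratings are invented for illustration.

```python
import math


def pearson(xs, ys):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Rows are respondents, columns are items scored 1 to 7 (hypothetical data).
responses = [
    [7, 6, 7, 2],
    [5, 5, 6, 6],
    [3, 4, 2, 5],
    [6, 6, 5, 1],
    [2, 3, 3, 4],
]

totals = [sum(row) for row in responses]
item_total = [
    pearson([row[j] for row in responses], totals)
    for j in range(len(responses[0]))
]

for j, r in enumerate(item_total, start=1):
    print(f"item {j}: r = {r:.2f}")
```

In this made-up data the fourth item correlates negatively with the total, which is the pattern that would mark a scale as a candidate for removal.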

Extensive revisions were made. The instrument
was shortened to sixty-one items or scales. One
classroom example was eliminated and one was rewritten,
making a final total of eleven. Particular
scales that did not discriminate between high and low
total scores were also omitted; therefore, only the
scales that had the highest correlations with total
score were used under each example in the third version.
One adjective scale that proved to be ineffective
was omitted, leaving a total of seven possible
scales.

Respondents were asked to rate, on the scales provided,
the feasibility of applying each classroom example.
The scales had seven positions ranging from
the negative extreme to the positive extreme; the
middle position could be used when the respondent
felt neutral or did not know how to answer. With the
exception of asking teachers to consider each statement
in terms of their classroom experience, the
instructions were modeled after those suggested by
Osgood and others (6:82-84).

By summing the point values for responses on each
scale, the test yielded a total possible score of 427
points. (The most positive response was awarded
7 points and the most negative response was scored
as 1 point; the five positions between the extremes
were awarded 2 to 6 points.) The instrument was
then given to thirty-one experienced elementary
school teachers; the data were again submitted
to item analysis, but only very minor changes
were made. The statements and scales that were
used in the Revised Version are presented
in the Appendix.
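
The scoring rule can be sketched as follows. The sketch assumes sixty-one scales of seven positions each, and it assumes that a mark is recorded as a position counted from the left pole, so that scales printed with the positive pole on the left (for example agree-disagree) must be reversed before summing. The function names are illustrative.

```python
N_SCALES = 61      # scales in the revised instrument
N_POSITIONS = 7    # positions on each scale


def points_for(position, positive_on_left):
    """Convert a marked position (1 = leftmost) into 1 to 7 points."""
    if not 1 <= position <= N_POSITIONS:
        raise ValueError("position must be between 1 and 7")
    # Reverse the scale when the positive pole is printed on the left.
    return N_POSITIONS + 1 - position if positive_on_left else position


def total_score(points):
    """Sum the point values over all scales of one inventory."""
    if len(points) != N_SCALES:
        raise ValueError("expected one point value per scale")
    return sum(points)


lowest = total_score([1] * N_SCALES)
neutral = total_score([4] * N_SCALES)
highest = total_score([7] * N_SCALES)
print(lowest, neutral, highest)
```

Under these assumptions scores range from 61 to 427, with 244 corresponding to the middle position on every scale.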
Validity and Reliability


Content validity was demonstrated for the class- 
room examples and adjective scales. Three profes- 
sors who teach courses in reading at the University of 
Wisconsin judged the classroom examples to be re- 
levant to measuring attitudes toward individualizing 
reading instruction. They, furthermore, judged that 
the classroom examples represented application of 
three assumptions of individualized instruction: 


1. The teacher should do diagnostic testing
to determine the specific strengths
and weaknesses of each child.


2. Children should receive instruction 
in the skills they need and move at 
their own pace. 


3. Children should be given materials 
appropriate to their abilities and in- 
terests. 


The same professors also judged the scales to 
be relevant to measuring attitudes toward individual- 
izing reading instruction. The scales had further va- 
lidity in that the adjectives were chosen from the lit- 
erature describing individualized reading instruction. 


The estimate of reliability or internal consistency
(Hoyt reliability coefficient) of the Revised Version
was .93 based on the data gathered from the
thirty-one experienced elementary school teachers.
Green (3) has stated that a high reliability coefficient
usually indicates that the items are homogeneous and
the scales unidimensional. Since the reliability coefficient
for the Revised Version was high and since,
therefore, the items were highly intercorrelated, the
instrument was apparently measuring one factor as
was intended instead of the three factors found by Osgood
and others (6). Presumably, this unitary factor
was teachers' attitudes toward individualizing
reading instruction.
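
For readers unfamiliar with the Hoyt coefficient, a minimal sketch is given below. It assumes the usual analysis-of-variance formulation, in which reliability is one minus the ratio of the residual mean square to the between-persons mean square in a persons-by-items table (numerically the same as coefficient alpha). The data are hypothetical and far smaller than the sixty-one scales and thirty-one teachers of the study.

```python
def hoyt_reliability(scores):
    """Hoyt coefficient from a persons-by-items table of scores."""
    n_persons = len(scores)
    n_items = len(scores[0])
    grand = sum(sum(row) for row in scores) / (n_persons * n_items)

    person_means = [sum(row) / n_items for row in scores]
    item_means = [
        sum(row[j] for row in scores) / n_persons for j in range(n_items)
    ]

    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_persons = n_items * sum((m - grand) ** 2 for m in person_means)
    ss_items = n_persons * sum((m - grand) ** 2 for m in item_means)
    ss_residual = ss_total - ss_persons - ss_items

    ms_persons = ss_persons / (n_persons - 1)
    ms_residual = ss_residual / ((n_persons - 1) * (n_items - 1))
    return 1.0 - ms_residual / ms_persons


# Hypothetical ratings: five respondents by three scales, scored 1 to 7.
ratings = [
    [7, 6, 7],
    [5, 5, 6],
    [3, 4, 2],
    [6, 6, 5],
    [2, 3, 3],
]
reliability = hoyt_reliability(ratings)
print(round(reliability, 2))
```

A coefficient near 1 arises when respondents keep nearly the same rank order from scale to scale, which is the sense in which a high value suggests homogeneous items.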


VALIDATION STUDIES 


Two types of questions were asked in seeking ex- 
perimental validation of the instrument: (1) Could 
the instrument successfully discriminate between the 
attitudes of teachers who were systematically individ- 
ualizing reading instruction and those of teachers who 
were not? (2) Could it measure a change in teach- 
ers' attitudes when instruction in a school had changed 
from conventional to individualized? 


Study 1 


The instrument was given to the teachers in two 
types of schools. The first type (Type 1) had successfully
implemented an experimental system for
individualizing reading instruction at least a year 
prior to the study. Teachers in Type 1 schools were 
systematically assessing pupil needs and planning in- 
struction accordingly. In the second type of school 
(Type 2) there had been no known emphasis on indi- 
vidualizing reading instruction. The hypothesis was 
that scores on the attitude inventory would be signif- 
icantly higher in Type 1 schools than in Type 2 schools. 


The Reading Teacher Survey, Revised Version,
was administered in the fall of 1969 in two Type 1
schools and five Type 2 schools in small or middle
sized Wisconsin cities. All classroom teachers of
grades 1-6 took the inventory; special teachers,
such as reading teachers, were not included.


Analyses and Results. A t-test was performed
on the data, using an estimate of the variance of the
means. The following formula was used for pooling
the variance of the means of Type 1 schools with that
of Type 2 schools:

$$s^2 = \frac{\sum_i (x_i - \bar{x})^2 + \sum_j (y_j - \bar{y})^2}{v}$$

where

$x_1$ = mean of scores in Type 1 School No. 1
$x_2$ = mean of scores in Type 1 School No. 2
$\bar{x}$ = grand mean of scores in Type 1 schools
$\bar{y}$ = grand mean of scores in Type 2 schools
$y_j$ = means of scores in Type 2 schools
$v$ = degrees of freedom (number of schools minus 2)


The .05 level for a two-tailed t-test was designated
as the level of significance for testing the difference
between means. The following formula was
used in figuring the t value:

$$t = \frac{\bar{x} - \bar{y}}{s}$$

Table 1 presents the means and standard deviations
of scores for teachers in the seven schools.


TABLE 1


MEANS AND STANDARD DEVIATIONS FOR
INVENTORY SCORES IN TYPE 1 AND 2 SCHOOLS

                                       Standard
Schools                      Mean      Deviation     N

Type 1 Schools:
  No. 1 ($x_1$)             388.67      41.13        21
  No. 2 ($x_2$)             365.38      24.15         8
  Grand Mean ($\bar{x}$)    377.02
Type 2 Schools:
  No. 1 ($y_1$)             362.33      48.32        21
  No. 2 ($y_2$)             338.10      32.42        21
  No. 3 ($y_3$)             331.83      33.29         6
  No. 4 ($y_4$)             328.20      41.49        15
  No. 5 ($y_5$)             327.50      48.07        10
  Grand Mean ($\bar{y}$)    337.59


The obtained t value (t = 2.65) from testing the differences
between means of inventory scores in Type
1 and Type 2 schools was significant at the .05 level
for a two-tailed test. It can also be noted from Table
1 that the ranges in scores of the two types of schools
did not overlap.
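
The reported t value can be retraced from the school means in Table 1. The sketch below rests on one reading of the procedure: the pooled variance is the sum of squared deviations of the school means from the grand mean of their own type, divided by the number of schools minus 2, and t is the difference between the two grand means divided by the square root of that variance. The function name is illustrative; the critical value 2.571 is the standard two-tailed .05 value of t for 5 d.f.

```python
import math

# School means of inventory scores, as listed in Table 1.
type1_means = [388.67, 365.38]
type2_means = [362.33, 338.10, 331.83, 328.20, 327.50]


def t_from_school_means(group_a, group_b):
    """t for two groups of school means with a pooled variance of the means."""
    grand_a = sum(group_a) / len(group_a)
    grand_b = sum(group_b) / len(group_b)
    sum_sq = sum((m - grand_a) ** 2 for m in group_a)
    sum_sq += sum((m - grand_b) ** 2 for m in group_b)
    dof = len(group_a) + len(group_b) - 2    # number of schools minus 2
    s = math.sqrt(sum_sq / dof)
    return (grand_a - grand_b) / s, dof


t_value, dof = t_from_school_means(type1_means, type2_means)
T_CRITICAL_05_TWO_TAILED_5DF = 2.571

print(round(t_value, 2), dof, t_value > T_CRITICAL_05_TWO_TAILED_5DF)
```

Read this way, the calculation gives about 2.65 on 5 d.f., matching the reported value and exceeding the .05 critical value.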


Study 2 


School No. 2 of the Type 2 schools adopted a system
for individualizing reading instruction during the
1969-70 school year. Since the Reading Teacher Sur- 
vey, Revised Version, had been administered the fall 
before in-service training on individualization was 
given, the inventory was readministered at the end 
of the school year to determine if a change in atti- 
tudes had occurred after teachers had been system- 
atically individualizing reading instruction for one 
year. As in Study 1, the instrument was taken by all 
classroom teachers in grades 1-6; special teachers 
were not included. 


Analyses and Results. Although teachers did not
sign their names, the inventories taken by each in the
fall and spring were paired together by coding the inventories.
The means and standard deviations of inventory
scores at each administration time are presented
in Table 2. (The data in Table 2 for the fall
administration are slightly different from the figures
given in Table 1 for the same school; several teachers
were omitted from the sample in Study 2 because
they resigned during the school year.)


A t-test for matched pairs was performed on the
data. The obtained t value (t = 4.09) was significant
beyond the .001 level for a two-tailed test.
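
Since the individual paired scores are not reported, the matched-pairs computation can only be illustrated with invented numbers. The sketch below uses five hypothetical fall and spring scores for the same teachers; the function name and all data are assumptions.

```python
import math


def paired_t(before, after):
    """t statistic and degrees of freedom for matched pairs."""
    diffs = [a - b for b, a in zip(before, after)]
    n = len(diffs)
    mean_diff = sum(diffs) / n
    var_diff = sum((d - mean_diff) ** 2 for d in diffs) / (n - 1)
    return mean_diff / math.sqrt(var_diff / n), n - 1


# Hypothetical inventory scores for five teachers (not the study's data).
fall_scores = [330, 345, 310, 360, 338]
spring_scores = [350, 372, 335, 371, 360]

t_value, dof = paired_t(fall_scores, spring_scores)
print(round(t_value, 2), dof)
```

Because each teacher serves as his or her own control, the test depends only on the within-teacher differences, not on the spread of scores between teachers.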


Discussion 


Of the two sorts of evidence obtained, the first 
appears to offer conclusive evidence of validity. 
Scores on the Reading Teacher Survey, Revised Version,
were significantly higher in schools where instruction
proceeded from the pre-assessed needs of
individual pupils than in schools where individualiza- 
tion was not systematically practiced. The second 
sort of evidence, scores obtained by the same teach- 
ers being significantly higher after they had adopted 
a system for individualizing reading instruction, is 
also supportive if one may assume that control groups 
would not register similar gains. 


CONCLUSIONS AND IMPLICATIONS 


TABLE 2


MEANS AND STANDARD DEVIATIONS OF
INVENTORY SCORES IN TYPE 2 SCHOOL NO. 2

Administration               Standard
Time              Mean       Deviation     N

Fall             336.35       32.85        17
Spring           361.12       38.22        17


The Reading Teacher Survey, Revised Version,
was shown in two studies to be effective in discriminating
among teachers' attitudes toward individualizing
reading instruction. Furthermore, the instrument
was demonstrated to have a high reliability and
content validity. It is easily scored, with a high total
score indicating a positive attitude toward individualizing
reading instruction. It is quickly administered,
taking no longer than 20 minutes; it may be
administered to a group with the directions being
read aloud, or it may be given to teachers to fill out
independently.


The instrument may be used in a variety of sit- 
uations. It would be especially useful as an eval- 
uative tool in studies of individualized approaches to 
reading instruction. It may also be used to evaluate 
the effects of training or some other treatment on the
attitudes of teachers. Frequently, teacher opinions 
and attitudes are solicited only by informal question- 
naires; the Reading Teacher Survey, Revised Ver- 
sion, provides a more objective means of assessing 
such attitudes. 


FOOTNOTE 


1. The material reported herein was prepared with
the support of the Wisconsin Research and Development
Center for Cognitive Learning, supported
in part as a research and development
center by funds from the United States Office
of Education, Department of Health, Education,
and Welfare. The opinions expressed herein
do not necessarily reflect the position or
policy of the Office of Education and no official
endorsement by the Office of Education should
be inferred. Center No. C-03/Contract OE
5-10-154.


APPENDIX


Statements and Scales in the Reading Teacher Sur- 
vey, Revised Version 


EXAMPLE:


A. Lucy, Larry, Joe and Dick need work on recognizing
final consonant sounds in words. It is
feasible for the teacher to work with these children
in a small group until they have mastered
this skill.

agree : : : : : : disagree
ineffective : : : : : : effective
challenging : : : : : : unchallenging
disorganized : : : : : : organized
practical : : : : : : impractical
fair : : : : : : unfair
inefficient : : : : : : efficient


Mark each scale in terms of the effect upon you, the
teacher, if this example of instruction were applied
in your classroom.


1. Pete and Gary are among the best readers in their
third-grade class. It is feasible for the teacher
to know that Pete has trouble reading social studies
books while Gary, who has no trouble with factual
material, cannot understand non-literal material.

agree : : : : : : disagree
ineffective : : : : : : effective
challenging : : : : : : unchallenging
disorganized : : : : : : organized
[illegible]
inefficient : : : : : : efficient

2. It is possible for the teacher to know that Dennis
is [illegible] in picking out the main idea of a paragraph
but good at recognizing all consonant and
vowel sounds.

agree : : : : : : disagree
ineffective : : : : : : effective
disorganized : : : : : : organized
practical : : : : : : impractical
fair : : : : : : unfair
inefficient : : : : : : efficient


3. Although Ruth is working in more than one set
of materials to learn the short a sound, it is
possible for the teacher to know which skill she
should be taught next.

agree : : : : : : disagree
ineffective : : : : : : effective
challenging : : : : : : unchallenging
disorganized : : : : : : organized
practical : : : : : : impractical
inefficient : : : : : : efficient


4. It is feasible for a second-grade teacher to use
the same 2-1 basal reader with the whole class.

agree : : : : : : disagree
ineffective : : : : : : effective
challenging : : : : : : unchallenging
disorganized : : : : : : organized
inefficient : : : : : : efficient


5. It can be expected that a second-grade teacher
will know when and how to teach outlining skills
to Gary who reads far above grade level.

challenging : : : : : : unchallenging
disorganized : : : : : : organized
practical : : : : : : impractical
fair : : : : : : unfair
inefficient : : : : : : efficient


6. It is feasible for Mary Lou, who has not mastered
initial consonant sounds, to continue work
on them although the rest of the children have
mastered this skill and have moved on to new
material.

agree : : : : : : disagree
challenging : : : : : : unchallenging
practical : : : : : : impractical
fair : : : : : : unfair


7. It is feasible in a second-grade classroom to
provide Pete with fourth-grade materials which
he can read and to give Peggy pre-primer material
which is appropriate for her.

agree : : : : : : disagree
[illegible]
challenging : : : : : : unchallenging
disorganized : : : : : : organized
practical : : : : : : impractical
fair : : : : : : unfair
inefficient : : : : : : efficient


8. Marjorie, David, Howard, Dorothy, and several
others are working together in a small group
on recognizing certain consonant blends. It is
possible for the teacher to assess at almost
every group meeting which children have mastered
this skill and to modify teaching accordingly.

agree : : : : : : disagree
ineffective : : : : : : effective
challenging : : : : : : unchallenging
disorganized : : : : : : organized
practical : : : : : : impractical
fair : : : : : : unfair
inefficient : : : : : : efficient


9. Jim does not seem to have much interest in reading
in the basal reader. The teacher can effectively
[illegible] reading skills.

disorganized : : : : : : organized
[illegible]
fair : : : : : : unfair
inefficient : : : : : : efficient


10. Jim, Dennis, Gary, Ruth, and Pete all need to
work on the vowel diphthongs oi and oy. It is feasible
to meet with this group once or several
times, depending on the length of time needed
for mastering these sounds in words.

agree : : : : : : disagree
ineffective : : : : : : effective
challenging : : : : : : unchallenging
disorganized : : : : : : organized
practical : : : : : : impractical
fair : : : : : : unfair
inefficient : : : : : : efficient


11. Gary has mastered all the work taught to the
class very quickly. It is feasible to allow him
to start working on vowel digraphs even though
the rest of the class still is working on consonant
blends and short vowel sounds.

agree : : : : : : disagree
ineffective : : : : : : effective
[illegible]
[illegible]
[illegible]
inefficient : : : : : : efficient


ABSTRACT


The effects of six classroom motivational treatments on 112 fifth and sixth grade students were measured
using a difference score on a substitution task. Individual goal setting and competitive treatments, under reward
and nonreward conditions, are analyzed by means of planned comparisons. Results indicate a significant interaction
which suggests caution against an oversimplified interpretation of main effects. A S's performance in a
competitive treatment is shown to be dependent upon three factors: his initial ability relative to that of his classmates;
the presence or absence of a reward; the homogeneous or heterogeneous nature of the group in competition.
The evaluation of this significant 3-way interaction (i.e., ability x reward x grouping) and comparisons of
means suggest several tentative hypotheses and raise highly relevant questions regarding the use of classroom
motivational techniques which are competitive in nature.


COMPETITION has been examined in the light of
such variables as sociometric status (11), rewards
and [illegible] (2, 5, 9), duration and repetition of
a task (3, 14, 15), and inter- and intragroup dependency
([illegible] 10). Studies have been conducted with a
variety of Ss, a wide range of tasks, diversity in the
[illegible] and incentives, and numerous combinations
of treatments. Yet there is a lack of consistency
among the findings, and directives for the use
of competition in education are very limited. For example,
while one study demonstrates that the efficiency
of work under the competitive condition is consistently
and significantly higher than under the
cooperative situation (4), another study reports that
performance in a cooperative situation is more efficient
than in its contrasting competitive treatment.
[illegible] and De Vault (12) emphasize that a lack of
systematic variation in treatments is a major flaw in
cooperation and competition research. Frequently a
single competitive treatment is compared with a single
noncompetitive treatment and the dichotomy allows
only limited generalization.


The discrepancies in definition of terms also complicate the interpretations and comparisons that might be made among studies. A situation defined as competitive-group versus cooperative-group by one E is identified as group-competition versus individual competition by another (7). For one study, competition is restrictively defined to include only performance in which the success of one member hinders the achievement of other members (3); for another it includes actively preventing competitors from reaching a goal as well as trying to achieve it for oneself (4). In a much earlier study competition is defined as an effort manifested by one when he is influenced by the “desire to excel” (6).


In examining the effects of competitive techniques in education, it is essential to clearly specify an operational definition and to systematically study treatment variations. The competition in our present educational system takes such various forms as interim reports, contests, class rankings, and scholarship awards. At times recognition is more formal than at others; competitors may be homogeneous or heterogeneous; the competitive nature of the task may be clearly defined or just generally assumed. In any case, a student's performance is usually compared with that of his classmates, with that of a local or national norm, or, through the use of self-progress charts, it is compared with his previous performance, in which case the student is seen as his own “rival” or competitor.


This study examines the effects of several competitive treatments used for classroom motivation. Competition is defined as a situation in which Ss are encouraged to surpass each other but are unable to directly affect the absolute score of their competitors. The treatments are so designed that three important variables can be studied simultaneously: reward, grouping technique, and ability of Ss. The difference score resulting from a 1 1/2 minute pre- and post-measure on a substitution task is the dependent variable.


DESIGN 


A 4x7 randomized block design, consisting of seven treatments and four ability levels, was used. In addition to a control, there were six motivational treatments which varied in type of grouping (i.e., homogeneous grouping, heterogeneous grouping, no grouping) and reward (i.e., absence and presence).

The blocks correspond to four levels of ability as measured by a pretest and are identified as High (H), High Average (HA), Low Average (LA), and Low (L). The dependent variable measured was the difference between the pre- and posttest on a substitution task. Each of the twenty-eight cells in this 4x7 layout contained four observations, yielding sixteen Ss per treatment for a total of 112 observations.
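The cell counts in this layout can be checked with a few lines of arithmetic. The sketch below is only an illustration: the variable names are hypothetical, and the last quantity assumes that the error term is the within-cell variation, which would agree with the 84 denominator degrees of freedom reported for the F tests in the Results.

```python
from itertools import product

# Labels taken from the Design section; names of the variables are illustrative.
ABILITY_LEVELS = ["H", "HA", "LA", "L"]                      # blocks
TREATMENTS = ["C", "I+", "I-", "Ho+", "Ho-", "He+", "He-"]   # control + six treatments
OBS_PER_CELL = 4

# Every block-by-treatment combination is one cell of the 4x7 layout.
cells = list(product(ABILITY_LEVELS, TREATMENTS))
n_cells = len(cells)
n_per_treatment = len(ABILITY_LEVELS) * OBS_PER_CELL
n_total = n_cells * OBS_PER_CELL

# Assumption: within-cell error degrees of freedom = observations - cells.
error_df = n_total - n_cells

print(n_cells, n_per_treatment, n_total, error_df)
```

Under these assumptions the counts reproduce the twenty-eight cells, sixteen Ss per treatment, and 112 observations stated above.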


The treatments are identified as follows: 


Individual Goal Setting with Reward (I+). Each S's posttest paper had a red line indicating how far he had worked on the pretest and a red-circled item indicating a 10-point increase over his initial score. The circled item was the S's “motivational goal.” Each S was promised a reward for achieving this goal.

Individual Goal Setting without Reward (I-). This condition was the same as I+ except that there was no promise of a reward for achievement.

Homogeneous-Group Competition with Reward (Ho+). The S was encouraged to surpass three fellow students, all of whom had pretest scores similar to his (i.e., within a 3-point range). Ss were promised a reward for achieving the highest score in a subgroup.

Homogeneous-Group Competition without Reward (Ho-). This condition was the same as Ho+ except that there was no promise of a reward for achievement.

Heterogeneous-Group Competition with Reward (He+). The S was encouraged to surpass three fellow students, all of whose pretest scores were quite different and represented a possible range of as much as 25 points. Ss were promised a reward for achieving the highest score in a subgroup.

Heterogeneous-Group Competition without Reward (He-). This condition was the same as He+ except that there was no promise of a reward for achievement.

Control (C). The S worked the posttest as a simple repetition of the pretest.


SAMPLE AND MEASURES 


The Ss were 112 students from the fifth and sixth grade team at Empire Elementary School in Freeport, Illinois. The IQ obtained from the school records ranged from 72 to 133; there were seventy-one boys and forty-one girls.

The task used to measure motivational effects of treatments was a digit-letter task resembling the digit-symbol section of the Wechsler Intelligence Test. The task consisted of associating one of six alphabet characters with a 2-digit number as indicated by a key, and reproducing the correct letter in each blank box according to a number printed directly above it. Ss were instructed to work across each row, moving from left to right beginning with the top row and working each in the order in which it appeared. They were informed of the time allowance of 1 1/2 minutes and were permitted to print or write the letter in capital or small form. A task consisted of ninety possible responses.


PROCEDURES 


Seven classrooms were used as experimental stations. One E administered all pre- and posttests; Ss remained in their assigned rooms occupied with independent studies while the E moved about testing one group at a time. With each group directions were read, a chalkboard demonstration was given, and questions were answered.

In both the pre- and posttest all students present were allowed to participate, although only 112 Ss were actually used for the final analysis.


For the pretest the E claimed an “interest in knowing how well fifth and sixth grade students performed on a substitution task.” Ss were rank ordered on the basis of this pretest and stratified into four equal-sized blocks corresponding to four ability levels (i.e., High, High Average, Low Average, Low). Twenty-eight Ss were then randomly selected from each of the four ability levels and randomly assigned to one of the seven treatments so that in each treatment there were four Ss from each ability level.


TABLE 1

PLANNED COMPARISONS

Comparison    Treatments
              C     I+    I-    Ho+   Ho-   He+   He-
1             0     0     0     1     1    -1    -1
2             0     2     2    -1    -1    -1    -1
3             0     1    -1     1    -1     1    -1
4             6    -1    -1    -1    -1    -1    -1
5             0     0     0     1    -1    -1     1
6             0     2    -2    -1     1    -1     1


TABLE 2

MEAN DIFFERENCE SCORES FOR TREATMENTS BY BLOCKS

Blocks    Treatments
          C       I+      I-      Ho+     Ho-     He+     He-
H        -2.75    1.25    1.75     .75    1.50   -1.25    1.25
HA       -1.75    0.00     .50    7.25   -3.50   -1.25    1.25
LA         .50    1.25   -2.75    4.50    4.00    2.50     .25
L         2.00    2.00    6.00    6.00   -1.75    6.00    1.50
MEANS    -0.50    1.13    1.38    4.63     .06    1.50    1.06
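Because every treatment has the same number of Ss in each block, the MEANS row of Table 2 should equal the simple average of the four block means in each column. The sketch below recomputes that row. The cell values are a reading of the printed table that assumes each cell mean, being an average of four whole-number difference scores, is a multiple of .25; the names `CELL_MEANS` and `treatment_means` are illustrative only.

```python
TREATMENTS = ["C", "I+", "I-", "Ho+", "Ho-", "He+", "He-"]

# Block-by-treatment mean difference scores (assumed reading of Table 2).
CELL_MEANS = {
    "H":  [-2.75, 1.25, 1.75, 0.75, 1.50, -1.25, 1.25],
    "HA": [-1.75, 0.00, 0.50, 7.25, -3.50, -1.25, 1.25],
    "LA": [0.50, 1.25, -2.75, 4.50, 4.00, 2.50, 0.25],
    "L":  [2.00, 2.00, 6.00, 6.00, -1.75, 6.00, 1.50],
}


def treatment_means(cell_means):
    """Average each treatment column over the equally sized blocks."""
    rows = list(cell_means.values())
    return [sum(column) / len(column) for column in zip(*rows)]


means = dict(zip(TREATMENTS, treatment_means(CELL_MEANS)))
for name, value in means.items():
    print(f"{name:>3}: {value:.4f}")
```

On this reading, the recomputed column averages agree with the printed MEANS row to two decimals, which is some reassurance that the cell values have been read correctly.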


The second task was administered 10 days later under the treatment conditions identified above. For the administration of the posttest each of the seven treatment groups was assigned to one of the seven available stations. In the four competitive conditions subgrouping was necessary in order to create the homogeneous and heterogeneous groups required for the treatments. Thus, the 16 Ss who were assigned to each competitive treatment were further divided into four small groups: for the homogeneous treatments four members from the same ability level formed a subgroup; for the heterogeneous treatments one member from each ability level was randomly assigned to a subgroup. All Ss in competitive treatments were told against whom they were competing and also told the pretest score of each competitor. In the reward treatments Ss were promised candy for successful performance.


ANALYSIS


Because the E was interested in particular comparisons among the treatments, the data were analyzed by planned orthogonal comparisons (8, 16). A total of twenty-four comparisons were used. Six of them were concerned with the differences among treatments; the remaining eighteen were orthogonal polynomial comparisons concerned with block (ability level) by treatment interactions. Table 1 summarizes the six basic planned orthogonal comparisons among treatments.

The first four were designed to answer the major questions with which this experiment was concerned:

1. Is mean performance in the Homogeneous Competition treatments equal to that of the Heterogeneous Competition treatments?

2. Is mean performance in the Competitive treatments equal to that of the Individual Goal Setting treatments?

3. Is mean performance in the reward treatments equal to that of the non-reward treatments?

4. Is mean performance in the six motivational treatments (both Competitive and Individual Goal Setting) equal to the mean performance of the nonmotivational or control treatment?

The last two comparisons were concerned with reward by treatment interactions:

5. Is the effect of reward the same in the Homogeneous Competition and Heterogeneous Competition treatments?

6. Is the effect of reward the same in the Competitive and Noncompetitive treatments?


RESULTS


The mean difference score for each of the twenty-eight cells is shown in Table 2. Figure 1 presents treatment means across blocks.


FIGURE 1

MEAN DIFFERENCE SCORES BY TREATMENTS

[Axes: mean difference score by treatment (C, I+, I-, Ho+, Ho-, He+, He-)]


The first planned comparison between homogeneous and heterogeneous competition resulted in (F(1, 84) = 1.04; p = .31). The second planned comparison between Individual Goal Setting and Competitive treatments resulted in (F(1, 84) = .39; p = .54). Comparison three concerning reward treatments versus non-reward treatments reached the .06 level (F(1, 84) = 2.46). The fourth comparison between control and motivational treatments also resulted in a p of .06 (F(1, 84) = 3.56). Comparison five, used to test interaction among the four competitive treatments across blocks, was found to be significant at the conventional .05 level: (Ho+, He- versus Ho-, He+) resulted in (F(1, 84) = 3.91; p = .05). The last comparison, designed to test interaction between the competitive and noncompetitive treatments under the reward and non-reward conditions, resulted in (F(1, 84) = 2.32; p = .13).


FIGURE 2

COMPETITIVE GROUPING BY REWARD INTERACTION ACROSS ABILITY LEVELS

[Axes: difference score by reward factor]


Figure 2 represents the interaction between reward and competitive grouping which proved to be significant. Subjects grouped homogeneously responded noticeably better than Ss grouped heterogeneously in the reward condition; the reverse was true in the non-reward condition.

Only one of the eighteen orthogonal polynomial tests concerning the interaction of the six basic treatment contrasts and blocks (i.e., ability levels) was significant; this test resulted in (F(1, 84) = 6.21; p = .01). It indicated that the difference between the four ability levels in the effect of reward in homogeneous and heterogeneous competition is significant and follows a cubic trend.

Figure 3 shows the reward by grouping interaction for each of the four ability levels. The results represented in Figures 2 and 3 indicate that in predicting treatment performance of fifth and sixth grade students in a competitive situation, three variables should be considered: presence or absence of a reward, S's ability in relation to classmates' ability, and the homogeneous or heterogeneous nature of competitors.
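The six planned comparisons among treatments can be written as contrast vectors over the treatment order C, I+, I-, Ho+, Ho-, He+, He-. The sketch below is an illustration, not the author's computation: the coefficient vectors are inferred from the six questions posed under ANALYSIS (their overall sign and scaling are assumptions), and the treatment means are a reading of the MEANS row of Table 2. It checks that the set is mutually orthogonal and evaluates each contrast on the treatment means.

```python
from itertools import combinations

# Treatment order: C, I+, I-, Ho+, Ho-, He+, He-
CONTRASTS = {
    1: [0, 0, 0, 1, 1, -1, -1],      # homogeneous vs heterogeneous competition
    2: [0, 2, 2, -1, -1, -1, -1],    # individual goal setting vs competitive
    3: [0, 1, -1, 1, -1, 1, -1],     # reward vs non-reward
    4: [6, -1, -1, -1, -1, -1, -1],  # control vs six motivational treatments
    5: [0, 0, 0, 1, -1, -1, 1],      # reward x (homogeneous vs heterogeneous)
    6: [0, 2, -2, -1, 1, -1, 1],     # reward x (noncompetitive vs competitive)
}

# Assumed reading of the MEANS row of Table 2.
TREATMENT_MEANS = [-0.50, 1.13, 1.38, 4.63, 0.06, 1.50, 1.06]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def is_orthogonal_set(contrasts):
    """Each contrast sums to zero and every pair has a zero dot product."""
    vectors = list(contrasts.values())
    sums_ok = all(sum(v) == 0 for v in vectors)
    pairs_ok = all(dot(a, b) == 0 for a, b in combinations(vectors, 2))
    return sums_ok and pairs_ok


estimates = {k: dot(c, TREATMENT_MEANS) for k, c in CONTRASTS.items()}

print("orthogonal:", is_orthogonal_set(CONTRASTS))
for k, value in estimates.items():
    print(f"comparison {k}: {value:+.2f}")
```

With equal cell sizes, orthogonality of the coefficient vectors is what allows each comparison to be tested on its own single degree of freedom. The estimate for comparison five is the reward by grouping interaction that reached the .05 level.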


DISCUSSION 


The results of this study clearly indicate that dichotomies such as competition versus noncompetition, reward versus non-reward and goal-setting versus no goal-setting reflect naive simplifications of the group motivation problem in an educational setting. Not one of the first four planned comparisons used to test such global distinctions was significant. The use of systematic variation in competitive treatments, as suggested by Phillips and De Vault (12), has definitely emphasized the complexity of the classroom motivation problem.

It seems relatively safe to conclude that, in general, Homogeneous Competition with reward is the most effective of these seven treatments when used in a classroom situation. This is consistent with much of the competition research as well as literature on the use of rewards and incentives. It is likewise reasonable to conclude, on the basis of this study, that the use of reward in competitive conditions has very different effects dependent upon the homogeneity or heterogeneity of competitors.


If one assumes that the procedures in this study were successful in making Ss aware of the similarity or dissimilarity of their competitors, and if one also assumes that the use of a material reward was perceived as assurance of public recognition for successful performance, one might speculate that the interaction was a result of Ss' discriminating between a socially acceptable and a socially unacceptable victory. Thus, while a S is justified in striving for an award which symbolizes superior performance among equals, it is far less socially acceptable to seek recognition when competitors are poorly matched on ability; this is particularly true for those who have a marked advantage.

Speculation on each of the interaction patterns for the individual ability levels requires much greater caution; as few as four observations determine each mean in the results diagramed in Figure 3. There is, however, one characteristic of the individual patterns that seems relevant and logically consistent with the “social propriety” speculation: the two higher ability levels show greater resistance to reward in heterogeneous competition than do the two lower ability levels.

Comparing the performance of the four ability levels under homogeneous competition reveals a striking result: the effect of reward is much greater in the Low and High Average ability levels than in the Low Average and High ability levels. Any interpretation offered on the basis of this limited data is admittedly highly speculative. However, these results raise a question of whether success and/or recognition of success may have relative value dependent upon subgroup status.


FIGURE 3

COMPETITIVE GROUPING BY REWARD INTERACTION BY ABILITY LEVELS

[Panels by ability level]


In addition to the speculations and tentative hypotheses which can be generated from the results of this study, several related questions can be raised. Four which seem to have special relevance for educational motivation are:


1. Is the reward in itself the significant factor or 
the class recognition implied in obtaining the 
reward? 


2. Do students prefer clearly defined competitive 
tasks over classroom activities in which com- 
petition is assumed but the parameters remain 
unspecified? 


3. To what extent does a S's initial success or failure (relative to his classmates' performance) influence his subsequent performance on a competitive task?


4. Does competition in an educational setting have 
the same effect on both power and speed tasks? 


Replications and carefully designed follow-up studies must be pursued if practical directives for educational competition are to be formulated. Under the present system of education there is little hope of ignoring competition, questionable value in deploring its presence, and no chance of eliminating its influence. On the other hand, it would seem both profitable and practical to research in greater detail the effects of competition both as a prevailing atmosphere resulting from the present educational and cultural patterns and as specific motivational treatments which may be used in classroom situations.


FOOTNOTE


The author is grateful to Professors T. Anne Cleary and G. William Walster for their advice and assistance in selecting and developing the methods of analysis and in interpreting the data.
ABSTRACT

College students voluntarily took all their courses or one course on a pass-fail basis. The mean grade point average (GPA) before conversion to pass-fail for freshmen taking all their courses pass-fail was 1.67 (C-), which is significantly lower than the 2.26 (C+) for controls who wanted but were denied pass-fail grading. Even after returning to conventional grading the former pass-fail students continued to get significantly lower grades than controls. Juniors taking one course on a pass-fail basis received significantly lower grades, before conversion, in their pass-fail course (mean 2.07) than did controls who wanted but were denied pass-fail grading (mean 2.40). There was no compensatory improvement in the grades received in non-pass-fail courses.


ONE PURPOSE of the college experience should be to develop in each student an intrinsic motivation to learn. In actual practice, however, most colleges use extrinsic grades to motivate students to learn. After graduation grades are no longer available, and intellectual activity often ceases.

Low grades and academic dismissal are the punishments for not studying, and high grades and honors are the rewards for successful study. Grade pressure is especially severe for marginal students who want to avoid academic dismissal and for better students who are competing for admission to graduate school. This system encourages students to select courses which promise high grades for a minimum of effort. “The attainment of high grades is often perceived not as a key to success, but as success itself” (2:179). The traditional grading system can be faulted for its emphasis on information rather than understanding, competition rather than appreciation, and quantity rather than quality. In addition, grades are inconsistent as not all instructors use the same grading standards.

A variety of remedies have been proposed for the presumed undesirable emphasis on grades. At Bennington and Sarah Lawrence Colleges periodic comments by instructors have replaced grades. The effectiveness of this system has unfortunately not been systematically evaluated, but even if it did prove to be a better method of evaluation than the present grading systems, meaningful individualized comments would be impossible with the high student-faculty ratio found at most institutions. An alternative method for de-emphasizing grades that seems more feasible for institutions with high student-faculty ratios is pass-fail grading. In this system, the number of possible grades is limited to two: P for pass and F for fail. Pass-fail grading removes the external motivation for non-failing students and, in theory at least, allows these students to study out of intellectual curiosity, rather than for grades. Pass-fail grading allows a student to de-emphasize, without penalty, aspects of a course or even entire courses that do not interest him. In theory, the student then directs his intellectual efforts to topics that are more consistent with his interests. This may include academic areas in which he would avoid conventionally graded courses for fear of a low grade.


Various forms of pass-fail grading have been ini- 
tiated at many American colleges. During the fall 
of 1967 we distributed a questionnaire about grading 
procedures to fifty-eight selected colleges. Based 
on newspaper and journal reports, rumor, and pure 
speculation, it was suspected that these schools had 
some form of pass-fail grading. In addition, the 
questionnaire was sent to all fourteen 4-year units 
of the State University of New York. Of sixty-three 
replies, thirty-seven (59%) indicated that they did 
offer some form of pass-fail grading, twelve (19%) 
others were considering it, and the remaining four- 
teen (22%) had no provision for pass-fail grading. 
That pass-fail is a relatively new grading system is 
evidenced by the fact that only one of the schools had
been offering it for longer than 3 years. A variety 
of pass-fail grading practices were found to exist. 

A few schools, including Massachusetts Institute of 
Technology, Yale, and Antioch, use pass-fail grad- 
ing exclusively. In all but seven (19%) of the schools 
offering pass-fail grading, courses that a student was 
allowed to take on a pass-fail basis were limited to 
one course per semester and four per college career.
Of the schools that offered pass-fail grading, nine- 
teen (51%) offered it to all undergraduates and sev- 
enteen (46%) offered it only to upperclassmen. In 
most schools, pass-fail was available in all courses 
offered. However, thirty (82%) limited the option 
to elective courses outside the student’s major. At 
Brandeis University a questionnaire (1) was sent to 
participating students and faculty. The results indi- 
cated generally uncritical support for pass-fail grad- 
ing. 
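The percentages quoted for the questionnaire can be recomputed from the counts given in the text. The sketch below is only a bookkeeping check; the labels and function names are illustrative, and the bases (63 replies, 37 schools offering pass-fail) are taken from the paragraph above.

```python
REPLIES = 63
# label: (count, percentage as printed)
REPLY_BREAKDOWN = {
    "offer pass-fail": (37, 59),
    "considering it": (12, 19),
    "no provision": (14, 22),
}

ADOPTERS = 37
ADOPTER_BREAKDOWN = {
    "not limited to one course per semester": (7, 19),
    "open to all undergraduates": (19, 51),
    "upperclassmen only": (17, 46),
    "limited to electives outside major": (30, 82),
}


def recompute(breakdown, base):
    """Whole-number percentage of each count relative to the base."""
    return {label: round(100 * count / base)
            for label, (count, _) in breakdown.items()}


def mismatches(breakdown, base):
    """Labels whose recomputed percentage differs from the printed one."""
    redone = recompute(breakdown, base)
    return [label for label, (_, printed) in breakdown.items()
            if redone[label] != printed]


print(recompute(REPLY_BREAKDOWN, REPLIES))
print(mismatches(REPLY_BREAKDOWN, REPLIES))
print(mismatches(ADOPTER_BREAKDOWN, ADOPTERS))
```

The three reply categories account for all 63 replies. On the bases assumed here, the only figure that does not reproduce is the elective-only restriction, where 30 of 37 works out to about 81 percent rather than the printed 82.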


Our questionnaire asked each school to evaluate 
the success of its pass-fail grading. Most of the 
comments on the overall success of pass-fail grading 
were subjective impressions. Of the thirty-seven 
schools offering a pass-fail option, seventeen (49%) 
felt that pass-fail had achieved some degree of suc- 
cess, seventeen (49%) felt that it was too early to 
tell, and only one (2%) judged the system to be un- 
satisfactory. The criterion used was student-faculty 
acceptance, rather than measurable intellectual ac- 
tivity. 


The apparent popularity of pass-fail grading seems 
to indicate that grade pressure is unpleasant. Reducing
unpleasant grade pressure may be a desirable
goal; however, the academic consequences of pass- 
fail grading must also be considered. The present 
study is a controlled evaluation of the effects of both 
one-course and complete pass-fail grading on aca- 
demic performance. 


METHOD 


The Ss were students at Cortland College of the 
State University of New York. Virtually all were 
New York State residents. 


During the summer of 1967 a stratified sample of 293 entering college freshmen with low (379 to 477), middle (511 to 559), and high (580 to 785) Scholastic Aptitude Test Verbal (SAT-V) scores were matched by SAT-V score and sex and assigned to either group 1, all courses pass-fail; group 2, one course pass-fail; or group 3, a control group. Similarly, a stratified sample of 218 college juniors with low (1.9 to 2.1), middle (2.2 to 2.5), and high (2.8 to 3.9) GPA's as of June 1967, were matched for GPA and sex and then assigned to group 2, one course pass-fail; or group 3, a control group. No juniors were assigned to group 1. All students were notified during the summer that they had been selected to participate in a 1-year pass-fail study and were requested to attend a meeting on the first day of the fall semester. At this meeting they were told the purpose of the study, and unaware of which group he had been assigned to, each freshman indicated (1) whether or not he would accept an all course pass-fail option if given the opportunity; and (2) whether or not he would accept a one course pass-fail option, and if so, in which course he would use the option. There were no restrictions on which course could be selected. Juniors were offered only the one course pass-fail option. Students were told that their choices would be binding. Finally, students were informed as to which experimental group they had been assigned. Instructors were not told which of their students, if any, were to receive pass-fail grades. During the semester, the students taking pass-fail courses received feedback, in the form of examination grades, as did their classmates. After final grades were submitted by instructors at the end of the semester, the appropriate A through D grades were converted to P (Pass), and D- and E grades were converted to F (Fail). These are the only grades appearing on students' transcripts. P or F grades were not used in computing grade point averages, but P grades were credited toward graduation. The traditional A through E grades of pass-fail students were used for evaluation purposes only. The traditional grades submitted for students under the pass-fail condition were compared with grades for the control group students that had requested the same pass-fail option but were not allowed to have it.
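The conversion rule described above (A through D become P; D- and E become F) can be stated compactly. The following is a minimal sketch under that reading; the function name `to_pass_fail` and the handling of plus and minus modifiers on grades other than D- are assumptions for illustration, not the registrar's actual procedure.

```python
VALID_LETTERS = set("ABCDE")
VALID_MODIFIERS = {"", "+", "-"}


def to_pass_fail(grade):
    """Convert an instructor's A-E grade (optionally with + or -) to P or F.

    Assumed rule, following the text: A through D convert to P,
    while D- and any E grade convert to F.
    """
    g = grade.strip().upper()
    letter, modifier = g[:1], g[1:]
    if letter not in VALID_LETTERS or modifier not in VALID_MODIFIERS:
        raise ValueError(f"unrecognized grade: {grade!r}")
    if letter == "E" or g == "D-":
        return "F"
    return "P"


submitted = ["B+", "C", "D", "D-", "E", "A-"]
transcript = [to_pass_fail(g) for g in submitted]
print(list(zip(submitted, transcript)))
```

Keeping the instructor's original grade alongside the converted P or F mirrors the design of the study, in which the traditional grades were retained for evaluation only.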


It was hoped that under pass-fail grading learning experiences might tend to be oriented away from rigid compliance with course assignments or preparation for examinations. This nonconformity could produce an initial deterioration in grades. However, in so far as a college education is cumulative, these independent learning experiences, if they occur at all, should eventually lead to improved academic performance as the accumulated wisdom is related to new courses. To test for such a delayed effect, the grades of the all-course pass-fail group were studied for the semester in which they returned to traditional grading (1968) and again for the first semester of the junior year (Fall 1969).


RESULTS 


Despite the supposed evils of conventional grading, only 28 percent of the freshmen wanted to take all their courses on a pass-fail basis, while 78 percent of the freshmen and 80 percent of the juniors wanted to take one pass-fail course.

Instructors submitted A through E (±) grades for all students and, where appropriate, the registrar made conversions to P or F grades.


All Courses Pass-Fail


The data for the all-courses pass-fail group are presented in Table 1 and Figure 1. The mean GPA before conversion for freshmen taking all their courses on a pass-fail basis was 1.67 (C-) compared with 2.26 (C+) for the freshmen controls who wanted but were denied the same option (p = .01, t-test). The difference between the all-course and control grades was greatest for the subgroup with high SAT-V scores. To summarize, all course pass-fail grading led to a decline in academic performance.

Academic performance during the first semester after the all-course pass-fail experience is presented in Table 1. These follow-up averages do not include data from seven experimental and three control students who, after poor academic performance, withdrew or were dismissed from the college without completing 1 semester under conventional grading. In this follow-up comparison both groups received conventional grades. The only difference between the groups was the previous pass-fail experience of the experimental group. Mean GPA was 2.28 (C+) for the pass-fail group and 2.72 (B-) for the controls (p = .01, t-test). Thus, taking all courses on a pass-fail basis for 1 or 2 semesters also impaired subsequent academic performance under traditional grading. One year later, in the first semester of their junior year (Fall 1969), the mean GPA was 2.69 for the pass-fail group and 2.86 for the controls (p > .05, t-test). These averages do not include data from eleven experimental and seven control students who withdrew or were dismissed from the college during the study.


TABLE 1

MEAN GRADE SUBMITTED FOR COLLEGE FRESHMEN TAKING ALL COURSES ON A PASS-FAIL BASIS

FIRST PASS-FAIL SEMESTER GRADES

                      Experimental        Control
SAT Verbal Score      GPA      N          GPA      N
580-785               1.55     9          2.53*    8
511-559               1.36     7          2.14     13
379-479               1.91     13         2.15     6
All Ss                1.67     29         2.26*    27

FIRST FOLLOW-UP SEMESTER WITH CONVENTIONAL GRADES

All Ss                2.28     22         2.72*    24

SECOND FOLLOW-UP; FALL SEMESTER OF JUNIOR YEAR

All Ss                2.68     18         2.85     20

* p = .01


FIGURE 1

MEAN OF GRADES SUBMITTED FOR FRESHMEN TAKING ALL COURSES ON A PASS-FAIL BASIS. CONTROLS WANTED BUT WERE DENIED PASS-FAIL GRADING

[Axes: grade point average by semester (pass-fail semester, first follow-up, second follow-up); series labeled ALL COURSES PASS-FAIL]


FIGURE 2

MEAN OF GRADES SUBMITTED FOR FRESHMEN AND JUNIORS TAKING ONE COURSE ON A PASS-FAIL BASIS. CONTROLS WANTED BUT WERE DENIED PASS-FAIL GRADING

[Axes: grade point average for the pass-fail course and all other courses, shown separately for freshmen and juniors]
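The "All Ss" averages in Table 1 can be recovered from the three SAT-V subgroups by weighting each subgroup mean by its N, and the numbers of students lost at each follow-up can be read off the changes in N. The sketch below is a consistency check only; the helper name `pooled_mean` is illustrative, and the inputs are the first-semester rows of Table 1 as read here.

```python
def pooled_mean(groups):
    """Mean over all Ss, given (subgroup mean, subgroup N) pairs."""
    total_n = sum(n for _, n in groups)
    return sum(mean * n for mean, n in groups) / total_n


# (GPA, N) for the high, middle, and low SAT-V subgroups.
EXPERIMENTAL = [(1.55, 9), (1.36, 7), (1.91, 13)]
CONTROL = [(2.53, 8), (2.14, 13), (2.15, 6)]

# N at the pass-fail semester, first follow-up, and second follow-up.
N_EXPERIMENTAL = [29, 22, 18]
N_CONTROL = [27, 24, 20]


def losses(ns):
    """Cumulative number of Ss no longer included at each follow-up."""
    return [ns[0] - n for n in ns[1:]]


print(f"experimental: {pooled_mean(EXPERIMENTAL):.3f}")
print(f"control:      {pooled_mean(CONTROL):.3f}")
print(losses(N_EXPERIMENTAL), losses(N_CONTROL))
```

Both pooled values fall within rounding of the printed 1.67 and 2.26, and the cumulative losses match the exclusions reported in the text.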


One Course Pass-Fail 


Mean grades for the one-course pass-fail option are presented in Table 2 and Figure 2. For all ranks, the mean A, B, C, D, E (±) grade submitted for the pass-fail course was lower than the mean grade for controls in the course they wanted but were not permitted to take on a pass-fail basis. The difference was significant for juniors alone and for freshmen and juniors combined (p < .05, t-test), but did not reach significance for the freshmen alone. Grades in the non-pass-fail courses, that is, all courses except the pass-fail choice, showed no significant differences between experimental and control Ss. Thus, the one course experimental group students failed to show significant compensatory improvement in their non-pass-fail courses.


Within Group Comparisons 


With data for freshmen and juniors combined, the students taking one pass-fail course received significantly lower grades in their pass-fail course than their non-pass-fail courses (p < .001, sign test). However, the control students also received significantly lower grades in the course they wanted but were denied the opportunity to take on a pass-fail basis (p < .001, sign test).


DISCUSSION


Student Acceptance 


The high level of student acceptance of limited 
pass-fail grading reported by other institutions was 
reflectedin the present study, with 78-80 percent of 
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TABLE 2 
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MEAN GRADE SUBMITTED FOR STUDENTS TAKING ONE COURSE ON A PASS-FAIL BASIS 


FRESHMEN 
Pass-Fail Course Non Pass-Fail Courses 
SAT Verbal Score ТУТ N Control N p Experimental Control 
580-785 1.85 25 2.22 20 n.s. 2.53 2.43 
511-559 1.69 24 1.85 24 n.s. 2.33 2.15 
379-479 1.47 22 1.48 21 п.5. 1.83 1.91 
АП Ss 1.67 71 1.83 65 п.5. 2.23 2.15 
Cumulative GPA JUNIORS 
2.8-3.9 T 2.54 26 2.67 16 n. а, | 3.25 3.30 
2.2-2.5 1.92 31 2.43 37 5.05 2.72 2.76 
1.9-2.1 1.83 32 2.23 33 7.05 2.46 2.60 
All Ss 2.07 89 2.40 86 =.05 2.78 2.80 
FRESHMEN AND JUNIORS COMBINED 
All Ss 1.89 160 2.15 151 7,05 2.94 2.52 n.$: 


the students electing to take one course on a pass-fail
basis. However, pass-fail grading was intentionally
made attractive in order to entice a large
number of students to participate in the study. If a
student passed a pass-fail course with a low grade,
he received full academic credit, but the low grade
was not averaged into his GPA. If he failed the course,
he naturally received no academic credit, but the
failure still was not averaged into his GPA. In order
to benefit from the option, the student had only
to be able to select a course in which he would get a
low grade. Grades achieved by the control groups
in the one course they wanted, but were denied the
opportunity to take on a pass-fail basis, were in fact
lower than the grades in their other courses. Thus,
students elect as pass-fail choices courses in which
they anticipate low grades. Consequently, if students
receive low grades in pass-fail courses in uncontrolled
studies, this may merely be due to their skill in
selecting courses in which they would perform poorly
anyway.
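The combined row of Table 2 can be cross-checked from the "All Ss" rows for freshmen and juniors. The sketch below assumes that the combined means are simple N-weighted means and that the non-pass-fail columns use the same group sizes as the pass-fail columns; the helper name `pooled_mean` is illustrative only.

```python
def pooled_mean(groups):
    """N-weighted mean of (mean, n) pairs."""
    total_n = sum(n for _, n in groups)
    return sum(mean * n for mean, n in groups) / total_n

# (mean grade, N) from the freshman and junior "All Ss" rows of Table 2
pf_experimental = pooled_mean([(1.67, 71), (2.07, 89)])
pf_control = pooled_mean([(1.83, 65), (2.40, 86)])
# Non-pass-fail courses, weighted by the same Ns (an assumption)
other_experimental = pooled_mean([(2.23, 71), (2.78, 89)])
other_control = pooled_mean([(2.15, 65), (2.80, 86)])

for label, value in [("pass-fail, experimental", pf_experimental),
                     ("pass-fail, control", pf_control),
                     ("other courses, experimental", other_experimental),
                     ("other courses, control", other_control)]:
    print(f"{label}: {value:.2f}")
```

Under these assumptions the pooled values come out at roughly 1.89, 2.15, 2.54, and 2.52.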


Pass-fail grading reduces grade pressure and 
thereby permits the student to divert some of his 
energy away from grade-oriented studying. Is this 
released time used to pursue additional academic in- 
terests, or does the overall level of academic activ- 
ity decline to the minimum necessary to obtain the 
pass grade? The follow-up grades for the all-course 
group provide a partial answer to this question. If 
during the all-course pass-fail experience students 
had spent test preparation time on intellectual pur- 
suits, then that activity should have been reflected 
in the following year’s grades. There was no evi- 
dence that a year of all-course pass-fail grading was 
subsequently advantageous. On the contrary, during 
the first follow-up semester on conventional grades 
the all-course group earned significantly lower grades 
than the control group that had wanted but had been 
denied the same pass-fail option. For the second 
follow-up year the results were in the same direc- 
tion but were not statistically significant. 


Somewhat similar results were obtained with
one-course pass-fail grading. Compared to the control
groups, juniors taking one pass-fail course obtained
lower grades in that course, while their performance
level remained unchanged in their other
courses. Thus, the data suggest that time released
from the one pass-fail course was not spent on
regularly graded courses.


In order to have the appropriate control groups,
the students participating in this study could not be
told which experimental group they were in until after
they completed academic registration. However,
one of the theoretical advantages of pass-fail grading
is that it encourages students to take courses
they would not otherwise be willing to attempt. In
order to facilitate and at the same time evaluate the
use of this advantage, we had a pass-fail table [...]
take a different course if they were in the pass-fail
group. These students were to obtain class cards
to cover both possibilities, after which we would
immediately tell them which group they were in so that
they could complete their registration. Of the
511 students involved in the study, only one indicated
that he would take a different course if he had the
pass-fail option. Thus, pass-fail grading did not
encourage students to take more challenging courses.


The academic decline observed under pass-fail
grading in this study may be attributed to the students'
previous experience. To students who have
been extrinsically motivated throughout their
school education, pass-fail grading may be
only an escape from serious study. For this reason,
pass-fail grading might prove more beneficial if
instituted earlier in the student's career, before grade
motivation becomes an obstacle.


Pass-fail grading probably will not solve the problems
of academic evaluation, but if initiated in such
a way as to minimize its abuses, it is a [step] toward
alleviating grade pressure. It may even offer students
the opportunity to develop an intrinsic
motivation to learn. The present study did not
adequately measure this change, if it occurred, since
extrinsic grades were the sole evaluative criterion.
However, there was no evidence in the follow-up
data of any long-term benefits. If pass-fail is to be
implemented at all, limits should be placed on the
option. The advantage of reduced grade pressure
may appear to some to outweigh the decline in
performance in courses such as terminal electives not
in the student's major.


In conclusion, the data suggest that what appears 
to be a trend toward pass-fail grading may be unwar- 
ranted. Students have learned how to work for 
grades and appear to learn a little in the process. 

It is as yet doubtful whether many have discovered 
how to learn without grades. 


FOOTNOTES 
ABSTRACT


Methodological problems have limited the usefulness of findings from experiments into learning by
discovery. By using programmed instruction materials, a within-class design, and other controls, an attempt was
made to remove confounding. Two tasks were used: concept learning and principle learning. For each task, a
separate 2x2x2 factorial design containing sixteen Ss in each cell was used. Independent variables were
instructional method (egrule and ruleg), school grade (9 and 5), and intelligence (high and average). A set of eight
different measures, involving retention, transfer, and ease of relearning, was used for each task. It was found
that the egrule and ruleg methods did not differ significantly, and that interaction between instructional method
and the other variables was low.


IN EDUCATIONAL literature relating to the teach- 
ing of mathematics and science, the present fashion 
is the advocacy of learning by discovery as the most 
effective teaching method. This belief is not based 
on firm experimental evidence. In 1966 Wittrock 
wrote: 


Many strong claims for learning by dis- 
covery are made in educational psychol- 
ogy. But almost none of these claims has 
been empirically substantiated or even 
clearly tested in an experiment (26:33). 


Later reviews support this statement (10, 22). Meth- 
odological problems relating to a lack of careful spec- 
ification of the treatments used and to a lack of con- 
trol of possible confounding variables prevent the un- 
ambiguous interpretation of the results from most 
experiments. 


In a large number of studies, nonsignificant differences
have been obtained. The recent experiment
conducted by Tanner (23) is an example. Ninth-grade
pupils (N = 360) were taught the principles of
mechanics in three different groups: expository-deductive,
discovery-inductive, and unsequenced-discovery.
Using a variety of measures, including
transfer and retention, and a within-class design
involving programmed instruction in an attempt to
control confounding variables, it was found that there
were no significant differences among the three methods.
In this experiment, however, as in a number
of other experiments into learning by discovery, the
actual degree of learning was low, making interpretation
of the results more difficult.


It is possible that the insignificant findings
represent the true position, and that students learn
equally well from ruleg and discovery methods. Significant
results could be due to confounding. Alternatively,
one method may be generally superior, and
more significant results have not been obtained due
to such factors as inappropriate measures, lack of
definition and specificity of treatment, and lack of
control; or possibly one method may be appropriate
to certain subsets of learners, which have not yet
been identified because no appropriate experiments
have been undertaken. In the present study, an attempt
was made to achieve a much higher degree of
control. This was done through the use of programmed
instruction booklets in which the two instructional
methods had exactly the same
content, organization, manual activity, and so on.
Approximately equal numbers from each school
class were in the two instructional groups, to control
for previous types of learning experiences. Subjects
were screened to ensure that they had no prior
knowledge of the task. Subjects in both groups had
the same time allocation, and were presented with
items at a fixed rate in order to ensure that all Ss
attempted all items. Efforts were made to optimize
the learning for each group within this time allocation.
Other sources of confounding were also eliminated.


In very few experiments have operational definitions
of the discovery and ruleg learning methods
been used. In the present experiment, the egrule
(examples followed by rule) and ruleg (rule followed
by examples) methods were clearly specified; the
only difference was the placement of the rule.


It has been argued (10) that it is necessary to
study the interaction of variables, since it is possible
that learning by discovery is more relevant to
certain population subsets than to the universal set.
In addition to instructional method, the variables
investigated were school grade, intelligence, and
category of learning. School grade was selected since
it was related to one of the few predictions concerning
learning-by-discovery variables. Ausubel postulated
that ruleg methods were generally superior
to discovery methods on a time-cost basis; he
recognized, however, the greater relevance of learning
by discovery for:


... children approximately below the age
of twelve; ... during the elementary-school
years; ... for children who are still functioning
at Piaget's level of concrete operations
(1:23).


It was hypothesized, therefore, that on a new task,
elementary school children would perform relatively
better on the egrule method, and that high school
students would perform relatively better on the ruleg
method.


Although some studies have been reported in
which lower ability groups tended to learn relatively
better from discovery methods than did high ability
groups (10), this finding has not always been obtained
(14, 23). It was hoped that the grade x intelligence x
method interaction in the present experiment would
clarify this inconsistency.


Concerning category of learning, Gagné has
stated: "The learning of concepts appears to require
a process of discovery..." while "Principle learning
can be done with or without discovery" (7:149-).
Since, by definition, it is not possible to use
precisely the same subject matter for both concept
and principle learning, it was necessary to use two
separate tasks; any difference involving the two tasks
could thus be ascribed to factors other than category
of learning (e.g., difficulty).


It seems necessary to justify the specific selection
of egrule and ruleg methods. There are a number
of methodological problems related to the presentation
of a rule following learning. Retroactive
inhibition, the Zeigarnik effect (of superior memory
for unfinished tasks), and the Ovsiankina effect
(of resumption of incomplete tasks) may possibly
be of importance depending on the operations
involved in the learning sequence, and must be taken
into account.


There seem to be five approaches with respect 
to this cluster of issues: 


(1) To present examples only (i.e., no rule) to
the discovery Ss. This eliminates the possibility of
retroactive inhibition affecting only the discovery
group; two confounding problems are, however,
introduced. First is the confounding problem with
respect to possible additional practice time for one
group only, as reported by Kersh (12, 13). Second,
and possibly related to the first, is the opportunity
for the Zeigarnik and Ovsiankina effects to operate.


(2) To use a much longer learning session. This
would probably have the effect of reducing "motivated
practice" between the learning and testing sessions;
Cronbach (4) quotes an experiment conducted
by Kersh involving sixteen training sessions in which
it was found that there was no difference between
groups in the use of information outside the class.
The problem of less control for long-term experiments
with human Ss still remains.


(3) To present the rule at the end of the learning
session to the inductive group only. This introduces
confounding, since only the egrule group would suffer
the possibility of retroactive inhibition effects.


(4) To present the rule at the end of the learning
session to both groups. There would seem to be the
possibility that the retroactive inhibition effect could
act differentially with respect to the different groups;
for example, during the learning session the ruleg
group could have been operating in terms of a "rule,"
while the egrule group could have been operating in
terms of a "method."


(5) To present the rule to the discovery group
about half-way through the learning session. This
would, it is hypothesized, considerably reduce the
effects of retroactive inhibition, since there is time
for the rule to be assimilated while the Ss work
through the remainder of the examples. Other effects
would appear to be eliminated since the rule is
presented to both groups.


The last approach was the preferred one, and
so the egrule group was first presented with the rule
(in each unit) just after they were half-way through
the unit. This was achieved by developing the units
as two modules. For example (where R = Rule;
E = Example; and A = Answer):


Module 1: R+E+A; E+A; R+E+A; E+A.

Module 2: E+A; E+A; E+A; E+A.


For each unit, the sequences were:


Ruleg group: Module 1-Module 2.

Egrule group: Module 2-Module 1.


The ruleg and egrule sequences were chosen
since they represent the two most common methods
TABLE 1


MEAN AGES (YEARS AND MONTHS) AND IQ'S WITH RESPECT TO GRADE, INTELLIGENCE,
INSTRUCTIONAL METHOD, AND CATEGORY OF LEARNING

                                       Grade 9                            Grade 5
                               High IQ       Average IQ          High IQ       Average IQ
TASK                        Egrule  Ruleg   Egrule  Ruleg     Egrule  Ruleg   Egrule  Ruleg

CONCEPT          Age  Mean   14-11   15-3    15-3    15-5      10-7    10-9    10-11    ?
(Matriculation)       S.D.    0-6     0-3     0-4     0-6       0-4     0-4     0-5     0-?
                 IQ   Mean    123     119     102     101       119     120     104     10?
                      S.D.    6.5     5.6     1       7.4       6       5.7     4.3     5.9

PRINCIPLE        Age  Mean   15-0    15-0    15-3    15-3      10-9    10-8    11-0     ?
(Map-Reading)         S.D.    0-3     0-5     0-5     0-5       0-3     0-4     0-4     0-?
                 IQ   Mean    124     121      98     101       118     120     102     105
                      S.D.    5.1     6.1     6.      5.5       3.8     5.3     5.1     6


of teaching, viz. the ruleg and the inductive methods.
The specter of confounding was removed since, in
addition to the resolution of the methodological problems
raised above, there was only one difference between
the groups, namely the stage at which the rule
was given. It should be noted that in Ausubel's
terminology, both methods were reception learning,
since the rule was given in each case.
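The two-module construction can be illustrated with a short sketch. It uses the item patterns given for Module 1 and Module 2 and shows that the two sequences contain the same items and differ only in where the rule first appears; the names `unit_sequence` and `first_rule_position` are illustrative and are not part of the original materials.

```python
# Item patterns as given in the text: R = Rule, E = Example, A = Answer
MODULE_1 = ["R+E+A", "E+A", "R+E+A", "E+A"]   # items that carry the rule
MODULE_2 = ["E+A", "E+A", "E+A", "E+A"]       # examples and answers only

def unit_sequence(method: str) -> list:
    """Order the two modules for one unit according to instructional method."""
    if method == "ruleg":
        return MODULE_1 + MODULE_2
    if method == "egrule":
        return MODULE_2 + MODULE_1
    raise ValueError(f"unknown method: {method}")

def first_rule_position(sequence: list) -> int:
    """Zero-based index of the first item that states the rule."""
    return next(i for i, item in enumerate(sequence) if item.startswith("R"))

ruleg = unit_sequence("ruleg")
egrule = unit_sequence("egrule")

print(sorted(ruleg) == sorted(egrule))   # identical content in both sequences
print(first_rule_position(ruleg), first_rule_position(egrule))
```

In this sketch the ruleg sequence meets the rule on its first item, while the egrule sequence meets it on the fifth of eight items, that is, just after the half-way point.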


The objective of this experiment was to conduct
a controlled investigation into the interaction of variables
in egrule and ruleg methods. For each of two
separate tasks (concept and principle learning), a
2x2x2 factorial design was used; the independent variables
were instructional method, school grade, and
intelligence. Various retention, transfer, and relearning
posttests constituted the dependent variables
and were taken 4 weeks after the learning session.


METHOD 
Subjects 


For each task, 16 Ss were tested in each cell (a 
total of 256 Ss). There were approximately equal 
numbers of males and females. Grades 5 and 9 were 
selected for the following reasons: 


(1) they are in the elementary and high schools
respectively,

(2) they correspond to two Piagetian stages, and

(3) they appear to cover the maximum grade span
for which suitable learning tasks could be devised.


Since it was possible that geography students could 
have studied relevant material, non-geography stu- 
dents only were used as grade 9 Ss. Each S attempt- 
ed one task only. 
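The sample size follows directly from the design. As a brief sketch, the cells of the two separate 2x2x2 designs can be enumerated and multiplied by the 16 Ss per cell; the variable names are illustrative.

```python
from itertools import product

tasks = ["concept", "principle"]          # two separate designs, one per task
methods = ["egrule", "ruleg"]
grades = [9, 5]
intelligence = ["high", "average"]
SS_PER_CELL = 16

# Each S attempted one task only, so every (task, method, grade, IQ) cell
# holds a different group of Ss.
cells = list(product(tasks, methods, grades, intelligence))
total_subjects = len(cells) * SS_PER_CELL

print(len(cells), "cells,", total_subjects, "Ss")
```

This gives 8 cells per task, 16 cells overall, and the total of 256 Ss stated above.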


The IQ ranges were 85-110 and 111+ for the average
and high intelligence groups respectively.
School records were used to obtain the IQ data. Details
concerning the groups are shown in Table 1.


Materials

The learning and testing materials were devised
by the author, and were extensively pretested before
use. They consisted of linear programmed instruction
booklets and separate answer sheets, while a
tape recorder was used to ensure a fixed rate of progress
through the booklets. Hints and fading techniques
were used (for both methods) where relevant.
Both tasks were meaningful and non-arbitrary.


Principle learning task. A map-reading task was
used. The four principles (listed in Table 2) were
arranged in the form of Gagné's (6:58) "..."
paradigm.


Each item in the booklet was written on the [...]
hand page; a dash indicated that a word or phrase
was to be constructed and written on the answer
sheet. The verbal content of the items was kept to
a minimum. Opposite the item was the appropriate
map, while the correct answer was on the back of
each question page. To illustrate the task, [...]
and its items (24-31) are presented in the appendix.
A scale, consisting of printed black lines on a
piece of Perspex, was used by Ss to interpolate
between the grid lines.
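The map-reading principles listed in Table 2 (easting, northing, six-digit position, and a zero third digit) can be illustrated with a small sketch. It assumes the usual six-figure convention, in which each half of the reference is a two-digit grid line number followed by the number of tenths interpolated past that line; the function name, the rounding rule, and the example coordinates are assumptions and are not taken from the study's booklets.

```python
def six_digit_position(easting: float, northing: float) -> str:
    """Build a six-digit position from coordinates given in grid-square units.

    The third digit of each half is the number of tenths past the grid line,
    so a point lying exactly on a grid line has zero as its third digit.
    """
    def three_digits(value: float) -> str:
        tenths = int(round(value * 10))    # interpolate to the nearest tenth (assumed rule)
        return f"{tenths % 1000:03d}"
    return three_digits(easting) + three_digits(northing)

print(six_digit_position(24.3, 61.7))   # a point between grid lines
print(six_digit_position(24.0, 61.0))   # a point on the intersection of two grid lines
```

Some conventions truncate rather than round the tenths; which rule the booklets used is not stated in the text.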
w 
Concept learning task. The concept used,
"matriculation to Sydney University," can
be considered as a classification rule. The four
defining characteristics of matriculation (see also Table 2)
can be shown as a decision tree (e.g., 11). This
concept would appear to be consistent with Gagné's
TABLE 2


SUMMARY OF ORGANIZATION OF THE
PROGRAMMED INSTRUCTION BOOKLETS


INSTRUCTIONS (Items 1-3)

  Concept and Principle: These items are to give Ss experience in the layout
  of the programmed instruction booklets and the answer sheets.

  Principle: Some basic terminology is introduced in these items.

PRELIMINARY SCREENING TEST

  Two items similar to those at the end of the program. No answers are given.

REVISION OF INSTRUCTIONS

LEARNING (Items 4-47)

  Concept                                        Principle
   4-11: Number of exam subjects                  4-23: Easting
  12-23: Levels of exam subjects                 24-31: Northing
  24-31: English compulsory                      32-39: Six-digit position
  32-47: Agriculture or Mathematics              40-47: Principle for third
         or Science compulsory                          digit being zero

CHECK ON LEARNING

  48-49: Two questions with no answers given, as a check on whether S had
         simply been copying down the correct answers.


definition (6:58). The task could be described as
one of concept attainment (see 2).
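The idea of the concept as a classification rule over four defining characteristics can be sketched as below. The four criteria follow the headings in Table 2, but every numeric threshold, the integer coding of subject levels, and the function name are hypothetical placeholders; the actual matriculation requirements are not given in the text.

```python
def matriculation_answers(passes, min_subjects=5, min_level=2, min_at_level=3):
    """Evaluate one item (a list of examination subjects passed).

    passes: dict mapping subject name -> level passed (hypothetical integer coding).
    All thresholds are placeholders, NOT the real university rules.
    """
    answers = {
        "number of exam subjects": len(passes) >= min_subjects,
        "levels of exam subjects":
            sum(level >= min_level for level in passes.values()) >= min_at_level,
        "English": "English" in passes,
        "Agriculture or Mathematics or Science":
            any(s in passes for s in ("Agriculture", "Mathematics", "Science")),
    }
    # The concept is attained only if every defining characteristic is present.
    answers["matriculates"] = all(answers.values())
    return answers

candidate = {"English": 2, "Mathematics": 2, "History": 2, "French": 1, "Geography": 1}
for criterion, ok in matriculation_answers(candidate).items():
    print(f"{criterion}: {'yes' if ok else 'no'}")
```

Each criterion yields a "yes" or "no", which mirrors the combination of answers that Ss recorded for each item.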


A tabular layout was used. At the top of each
(left hand) page was the background material necessary
for Ss to appreciate the type of criterion being
taught. (The right hand pages were blank.) The criteria
were related to the column headings, while each
row represented one item (i.e., a list of examination
subjects passed). Subjects recorded each answer
(always a combination of "yes" and/or "no") in
an answer booklet resembling the learning booklet;
the correct answer was given on the next (left hand)
page.


A summary of the organization of the programmed
instruction booklets for each task is presented
in Table 2. For each individual task, the examples,
verbal material, etc., are exactly the same; the only
difference is the placement of the rule. The
results on the learning check items are
presented in Table 3.


Posttests. Posttests A and C (different sets of
tests for each task) were parallel forms. Each test
contained about 50 percent retention, 25 percent
near transfer, and 25 percent far transfer items.
The retention items were exactly the same as some
of the items in the original learning task and, for
posttest C, were exactly the same as some of the
items in the relearning program. The far transfer
items generally required Ss to perform operations
which were the reverse of those required in the
learning process; for example, whereas in the learning
of the map-reading task, Ss were required to
read the position from a map, in the far transfer
task they were given a position and asked to place it
on a map. All answers had to be constructed; the
scoring was based on the correctness of the answers.


Relearning programs constituted posttest B.
Omitting the three introductory items, alternate
items of the remaining forty-four items of the relevant
original learning program (i.e., 22 items)
were used. The arrangement of the learning programs
meant that items containing the rules were included
(i.e., those items with even numbers). Both
egrule and ruleg relearning programs were prepared.
The answer to every second item was omitted,
thus resulting in eleven test items. For the relearning
programs, two measures were employed:
first, the correctness of the constructed answers
for the eleven items without answers given, and
second, the time taken to complete the program.
Separate answer sheets were used. There was no
time limit on the tests individually or as a whole;
the time, which was written on the chalkboard every
minute, was copied down by Ss onto their answer
sheets at the commencement and at the end of each
of the three tests.
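The item counts for the relearning program can be traced with a few lines. The sketch follows the description above (items 4-47 of the learning program, even-numbered items retained, every second answer omitted); which of the alternating items had their answers withheld is not stated, so the choice made here is an assumption.

```python
learning_items = list(range(4, 48))       # items 4-47: the forty-four learning items

# Alternate items, taken as the even-numbered ones so that the rule items are included
relearning_items = [item for item in learning_items if item % 2 == 0]

# Assumption: answers are withheld from the 2nd, 4th, ... relearning items
test_items = relearning_items[1::2]

print(len(learning_items), len(relearning_items), len(test_items))
```

The counts reproduce the 44, 22, and 11 items reported in the text.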


That the various tests were effective was demonstrated
by the data showing that on each of the
fourteen tests used, at least two significant results
were obtained at the .01 level (with one minor exception).
It was therefore decided to undertake item
analyses on only one set of parallel forms: Posttest
A. Each analysis was performed on eighty scripts,
ten scripts being randomly selected from each of the
eight categories. Mean difficulty ranged from 0.46-0.55
and 0.51-0.71 for the matriculation and map-reading
tasks respectively (where 1.0 = all items
answered correctly). Kuder-Richardson 20 (KR-20)
reliabilities ranged from 0.77-0.96. Discrimination

TABLE 3


NUMBERS OF SUBJECTS PASSING THE LEARNING
CHECK ITEMS

                Instructional        Grade 9               Grade 5
Task            Method           High IQ  Average IQ   High IQ  Average IQ

Matriculation   Egrule             15         9           4         2
                Ruleg              16        11           6         5
Map-Reading     Egrule             16        15          12        10
                Ruleg              16        15          13        10
indices (phi and point-biserial) were satisfactory.
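For readers unfamiliar with the item statistics mentioned above, the following sketch computes item difficulty and the KR-20 reliability from a matrix of right/wrong item scores. It uses the standard KR-20 formula with population variances; the small response matrix is invented for illustration and is not data from the study.

```python
from statistics import pvariance

def item_difficulties(responses):
    """Proportion of Ss answering each item correctly (1.0 = all correct)."""
    n_subjects = len(responses)
    n_items = len(responses[0])
    return [sum(row[j] for row in responses) / n_subjects for j in range(n_items)]

def kr20(responses):
    """Kuder-Richardson formula 20 for rows of 0/1 item scores."""
    k = len(responses[0])
    p = item_difficulties(responses)
    sum_pq = sum(pj * (1 - pj) for pj in p)                  # sum of item variances
    total_var = pvariance([sum(row) for row in responses])   # variance of total scores
    return k / (k - 1) * (1 - sum_pq / total_var)

# Invented scripts: 4 Ss by 3 items (1 = correct, 0 = incorrect)
scripts = [
    [1, 1, 1],
    [1, 1, 0],
    [1, 0, 0],
    [0, 0, 0],
]
print(item_difficulties(scripts))
print(kr20(scripts))
```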
Procedure 


The experiment was undertaken in schools, the 
class group being the unit for learning and testing. At 
the elementary school level, nine classes at four 
schools were involved. At the high school level, ten 
classes of non-geography students at six schools were 
involved. It was anticipated that the possibility of 
communication would be decreased by having a low 
number of classes at any one school. For all ele- 
mentary school classes, the learning and testing ses- 
sions were held in the morning. 


For control purposes, it was considered that all
Ss in the same grade should attempt all items in the
same time allocation. A fixed rate of presentation
was thus used; available data from programmed instruction
experiments tend to indicate that there is
no difference in attainment if a fixed rate of presentation,
as compared to S's own rate, is used (15, 20).
Following pretesting, when it was found that the appropriate
time for grade 5 Ss was too long for grade
9 Ss, who became frustrated or bored, different time
allocations were utilized (43 and 30 minutes respectively,
not including instructions).


To control for background knowledge, the (few)
Ss who passed the preliminary screening test were
excluded from the analysis. The final sample thus
consisted of Ss who were randomly assigned on a
within-class basis. They attended the learning and
testing sessions, had IQ data available, and failed
the screening test.


Learning session. The pupils were seated in
their normal pattern. E was introduced by the principal
or class teacher, who stated that E was interested
in the development of new teaching methods; no
mention was made of discovery learning or of subsequent
testing.


Before the material was handed out, E asked the
Ss to leave it unopened on their desks. The learning
programs were distributed in such a way that approximately
the same number of pupils were placed in the
various cells formed from the instructional method
and category of learning variables. In order to remove
any tendency to "cheat," the matriculation and
map-reading programs were handed out to alternate
columns of Ss. To overcome any bias due to class
arrangements, Ss in each column were alternately
given egrule and ruleg programs.
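The distribution scheme can be sketched for an idealized rectangular seating plan. The function name and the assumption that every column starts with the same method are illustrative; the text specifies only that tasks alternated by column and methods alternated within each column.

```python
from collections import Counter

def assign_booklets(n_columns: int, n_rows: int) -> dict:
    """Map each seat (column, row) to a (task, method) booklet."""
    tasks = ("matriculation", "map-reading")
    methods = ("egrule", "ruleg")
    seating = {}
    for col in range(n_columns):
        for row in range(n_rows):
            # Tasks alternate across columns; methods alternate down each column.
            seating[(col, row)] = (tasks[col % 2], methods[row % 2])
    return seating

plan = assign_booklets(n_columns=4, n_rows=4)
print(Counter(plan.values()))
```

With an even number of columns and rows, each of the four task-by-method combinations receives the same number of Ss, and no S sits beside someone in the next column working on the same task.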


Ss were asked to write their names on the answer
sheets. E then stated that there were two separate
tasks, and that each person would do one task
only. The matriculation task was identified, and the
definition of matriculation was read from the front of
the booklet. Using sheets of cardboard, tied together
and otherwise set up to resemble the first pages of
the programmed instruction booklet, E identified the
following: position of question (and the red box surrounding
it), position of question number, columns
and rows constituting the question, position of answer
on subsequent page (together with its green box), and
position of next question. E told Ss that the answer
to each part of each question was either "yes" or
"no." Again, using sheets of cardboard, E identified
the following for Ss doing the map-reading task:
position of map, position of question,
position of question number, and position of answer
on next page. E emphasized to Ss that all answers
were to be written on the answer sheets. They
were told to read the printed instructions to themselves
while E read them aloud. The instructions
were:


An attempt is being made to improve teaching
methods, and your help will be greatly appreciated.


In this booklet are a number of questions.
Work out the answer, write it down in the space
provided, then see if it is exactly the same
as the correct answer given on the next page.


If your answer is different, or if you haven't
worked out the answer before the bell, quickly
try to work out how to get the correct answer.


Go on to the next question when the number is
called, even if you have not worked out the answer;
the following questions will help you.


If you finish before the number is called, think
about the method for getting the correct answer;
do not go on to the next question until the number
is called.


Do the best you can.


A further instruction was added: "Commence when
you hear the number 1 on the tape recorder." E then
pressed the appropriate button on the tape recorder.
On the cassette were the numbers at the intervals
specified. Immediately before each number was a
warning bell. Depending on the time allocation for
the item, another bell sounded 7 1/2 to [...] seconds
before the warning bell for the next item (to encourage
students "to try to work out how to get the correct
answer" as stated in the instructions).


Immediately before the sounding of "four," the
tape recorder was held, and after asking Ss to read
the following printed instructions to themselves, E
read them aloud, as follows:


We will stop to make sure that everyone knows
exactly what to do.


1. (a) When you have worked out the answer,
write it down.

(b) Look at the correct answer, and see if
your answer is exactly the same.

(c) If your answer is different, quickly try to
work out the method for getting the correct answer;
do not change your original answer.

(d) Go on to the next question when the number
is called.


2. (a) If you haven't worked out the answer before
the bell sounds, look at the correct answer, and
quickly try to work out the method for getting
the correct answer.

(b) Go on to the next question when the number
is called.


3. Don't worry if your answer is wrong; there
TABLE 4


EXPERIMENTAL RESULTS

                                        Grade 9                            Grade 5
                                High IQ       Average IQ          High IQ       Average IQ
Measure                      Egrule  Ruleg   Egrule  Ruleg     Egrule  Ruleg   Egrule  Ruleg

Matriculation (C)

A: Retention         Mean     28.69  26.25   17.63  20.06      16.69  21.88   17.75  14.94
   (32)*             S.D.      3.96   4.49    1.83   8.44       6.77   4.82    5.91   7.66
A: Far Transfer      Mean     15.25  15.25   11.19  13.06       8.69  10.38    8.88   8.81
   (19)*             S.D.      3.69   3.92    3.94   3.97       4.54   4.54    3.50   5.10
A: Near Transfer     Mean     16.50  16.38   12.63  14.13       9.63   9.88    7.69   8.00
   (20)*             S.D.      1.10   1.45    4.40   3.10       4.47   5.14    3.34   3.33
B: Correct Responses Mean     29.38  28.31   23.56  25.38      23.94  24.63   19.00  19.56
   (?)*              S.D.      1.09   3.67    5.24   5.15       4.27   4.95    4.50   5.34
B: Time              Mean      8.06   9.25   11.31  10.50      18.75  17.13   17.06  13.31
                     S.D.      8.27   2.05    4.09   3.18       8.67   3.42    6.71   4.01
C: Retention         Mean     31.31  30.06   26.25  28.63      26.94  25.94   19.00  22.00
   (32)*             S.D.      1.74   2.98    5.01   3.88       5.17   4.87    6.08   5.94
C: Far Transfer      Mean     17.13  16.38   11.63  12.38      10.31  12.88    7.19   1.15
   (19)*             S.D.      2.55   3.74    3.98   4.77       5.64   4.41    8.71   5.15
C: Near Transfer     Mean     14.88  15.63   12.00  13.75      11.19  11.75    8.00   8.06
   (20)*             S.D.      2.92   1.78    3.72   3.24       3.25   3.32    3.45   3.04

Map-Reading (P)

A: Retention         Mean     41.56  41.13   30.06  34.75      24.25  30.06   18.88  21.81
   (47)*             S.D.      7.59   7.46   13.90  10.43      14.45  15.37   15.72  14.95
A: Far Transfer      Mean     21.38  20.06   12.25  11.00       1.88   7.69    5.44   3.13
   (?)*              S.D.      7.04   6.79    9.88   8.12       8.26   6.50    5.1?   5.20
A: Near Transfer     Mean     21.88  20.13   15.94  18.63      13.25  16.44   10.00  10.81
   (23)*             S.D.      1.78   4.62    8.58   5.70       6.67   7.96    8.45   7.35
B: Correct Responses Mean     51.00  50.81   45.50  46.06      45.94  48.06   32.94  33.44
   (53)*             S.D.      1.90   2.01   12.40   8.13       8.05   5.1?   15.18  16.08
B: Time              Mean      9.38   8.13   10.81  11.??      16.44  16.06   17.63  16.25
                     S.D.      1.89   2.42    3.10    ?          ?      ?       ?      ?
C: Retention         Mean     44.56  44.13   38.63  38.25      33.94  41.38   25.31  29.00
   (46)*             S.D.      1.32   2.13   12.58   9.16      13.16   8.40   14.66  15.38
C: Far Transfer      Mean     24.13  22.44   15.50  12.69      11.00  12.75    5.00   2.94
   (?)*              S.D.      3.74   6.13   10.15   9.21       7.93   8.40    4.95   5.23
C: Near Transfer     Mean     21.88  20.94   19.06  18.31      16.38  19.31   11.88  13.31
   (23)*             S.D.      1.21   3.24    5.??    ?          ?      ?       ?     7.10

* Indicates maximum possible score
are clues in the following questions which will
help you to work out the correct method.


4. You will not be able to ask any questions when
you start again. Have you any questions now?


The vast majority of Ss worked consistently, per- 
haps due to the novelty of the program and/or the 
constant pace demanded by the tape recorder. At the 
end of the session, E thanked them for doing so well, 
and asked them not to discuss the work in the book- 
lets with anyone. (Ss were not told that they would 
be tested later. ) 


Testing session. Testing was held exactly 4 
weeks after the learning session, generally at the 
same time of day. Due to school programming dif- 
ficulties, one of the nine elementary school and two 
of the ten high school classes were tested 1 day ear- 
ly, and one small high school class was tested 2 days 
early. 


To save time and prevent possible disorganization,
Ss' names were written in advance on posttests
A and B. On a random basis within each class, Ss
were assigned to the egrule or the ruleg relearning
program (i.e., posttest B).


The following instructions were given: 


There are three separate parts to today's
work. The first is to see how well you remember
the work you did last time I was
here. The second part is similar to what
you did last time, and you will be working
through the booklets in the same way as last
time; remember to put your answers on the
separate answer sheet. The third part is
like the first. At the start and end of each
part, write down the time from the board
(signify position). Put up your hand when
you finish each part. Try each question,
but if you can't answer a question, go on to
the next question; don't spend too long on
any one question. Do the best you can.
Commence now.


As each S finished one of the posttests, E collected
it and gave him the next posttest; this was to
ensure that the rules forming part of the relearning
program (posttest B) could not be referred to by Ss
when doing either posttest A or C.


RESULTS

The set of results is shown in Table 4. Analysis
of variance computations were undertaken. Tests of
homogeneity of variance were undertaken ([...]
method); where appropriate, transformations were


TABLE 5


VALUES OF F (ABOVE 1)

_ POSTTEST 
A B С T 
Factor Task Retention Transfer Relearning Retention тгапзіе Neat 
Far Near Correct time Far 
2,0 
Method (M) © 1.4 2.2 
P 2.0 1,5 1.7 1.3 “* 
1.5 
Intelligence (I) С 25.8** 6,7% 15.8** 33,8** 31.3**  32.1** za 9” 
Р 9,8% 20.9% 11.64 — 25,8** 4,1+  15,9** — 43.5** ". 
0.9 + 
Grade (G) c 22.1**  37.1** 95.5** 36.5**  63.9** 46.3** 38. 2** i ? 
P 30.2** 52, 5** 27. 8** 18,4%% 88,7%% 19. 0** 56. 7** 
MI @ 1.5 5.4* 
Р 
MG с 2.9 2! 
P 1.6 
IG с 6.3* 2.1 8.7** 2.7 2% 
Р 1.9 6.3* T4 1.8 
MIG c 8.0* 15 1.2 
P 1.4 
C - Concept Learning Task ( Matriculation) 
P - Principle Learning Task (Map-Reading) 
** - Significant at 0, 01 level 
Ж ox 


Significant at 0. 05 level 
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made to the data (e.g. transformation of,/x + 0. 5 
recommended by Bartlett (5), and further analysis 
of variance determinations computed. It was found 
that there was no difference in level of significance 
on any effect before and after the transformations; 
the given values of F in Table 5 are from the analy- 
sis with the lowest F levels, 
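
As a rough illustration of the transformation mentioned above, the following Python sketch applies √(x + 0.5) to two sets of scores and compares the ratio of their variances before and after. The scores are invented for illustration and are not data from the experiment, and the sketch does not carry out Bartlett's test itself.

```python
import math
from statistics import variance


def sqrt_transform(scores):
    """Apply the sqrt(x + 0.5) transformation to a list of raw scores."""
    return [math.sqrt(x + 0.5) for x in scores]


# Hypothetical raw scores: the spread of the second group grows with its mean,
# which is the situation a square-root transformation is meant to help with.
low_group = [0, 1, 1, 2, 3]
high_group = [6, 9, 12, 15, 18]

# Ratio of sample variances before and after the transformation.
raw_ratio = variance(high_group) / variance(low_group)
transformed_ratio = (
    variance(sqrt_transform(high_group)) / variance(sqrt_transform(low_group))
)
```

With these invented scores the variance ratio is much closer to 1 after the transformation, which is the sense in which the transformed data better satisfy the homogeneity assumption of the analysis of variance.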


Grade and Intelligence. As expected, grade 9
Ss performed significantly better than grade 5 Ss, and
high intelligence Ss performed significantly better
than average intelligence Ss.


Instructional Method. There were no significant differences between egrule and ruleg methods. Based on the significant and nonsignificant interactions, there would seem to be a slight superiority of the ruleg method for all groups except the grade 9 high intelligence group. Considering the large number of measures employed, the two significant interactions involving instructional method could be due to chance.


Category of learning. Based on graphic plots, significant interactions between category of learning and instructional method were obtained on the two sets of far transfer measures. For posttests A and C, on the far transfer measures, the performance of the egrule group was superior to the performance of the ruleg group on the principle learning task, contrasted to the superiority of the ruleg group on the concept learning task.


Sequence of learning. The utilization of the relearning measures enabled analysis into the effectiveness of the various sequences of learning, namely, egrule-egrule, egrule-ruleg, ruleg-egrule and ruleg-ruleg. For the six analyses (from the three sections of posttest C for each task), not one main effect or interaction involving instructional method was significant.


DISCUSSION 


The results of no significant differences between instructional methods, and insignificant superiority of the ruleg method, are similar to other recent findings (23, 24, 25).


The interaction effects involving instructional method with grade and/or intelligence were, with two exceptions, insignificant. These exceptions, together with an analysis of the complete set of results, suggest that the egrule method is relatively more suitable for older students of high intelligence. From a recent apparently well designed experiment, Tanner (23) reported similar results: no significant main effects were found with respect to method of instruction, while there was a tendency for the discovery method to be relatively more effective for higher intelligence Ss. The present results do not support a number of experimental findings (3, 9, 18) in which it has been found that a discovery method is relatively superior for average intelligence Ss. The present results question the prediction from Ausubel's thesis of superior results from the ruleg method for high school students, and superior results from the egrule method for elementary school children.


Interpretation of the interactions for both far transfer measures between instructional method and category of learning is difficult. Certainly the finding does not confirm the prediction from Gagne's hypothesis concerning superior results from the egrule method for the concept learning task. This is possibly due to the differences in the tasks, especially the differences in difficulty level between the two tasks.


The reasons for the lack of significance would 
appear to lie principally in the treatments. The 
tasks were non-arbitrary. Subjects participated in 
the experiment only if they failed the preliminary 
screening test. For one task (map-reading), both
instructional groups performed equally well on the 
learning check items, indicating that this known pos- 
sible problem of different amounts of learning was 
negligible for this task. The tasks, especially the 
map-reading task, were not too difficult; neither 
would they seem to have been too easy. A higher
degree of control than that of many previous exper- 
iments was established by using programmed in- 
struction booklets with the same verbal content and 
organization, and by randomly assigning Ss to treat- 
ments within classes in order to control class dif- 
ferences in attitude, achievement, and previous ex- 
perience. The time spent by both instructional 
groups was exactly the same. The measures em- 
ployed would appear to have been capable of discrim- 
inating any significant difference. 


Why do the present results fail to confirm the findings obtained by Gagne and Brown (8), Scandura (19), Kersh (12), Ray (16) and Rowlett (17), who reported the majority of results significantly favoring the discovery method? The Ss in the discovery group in the experiments conducted by Gagne and Brown (8), Scandura (19) and Kersh (12) spent more time learning the material than the Ss in the corresponding ruleg group. In addition, Ss in Kersh's (12) group spent more time on the task between the learning and testing sessions. Ray's (16) discovery group spent more time in manual activity than the ruleg group on a micrometer-reading task described by Grote (9) as “manipulative” in nature. From replications of Rowlett's (17) experiment under different conditions by Rowlett (18) and by Suess (21), insignificant results have generally been obtained. The specification of the differences between the discovery and ruleg methods in most of these experiments has been inadequate. Grote, who worked at Illinois at about the same time as Ray and Rowlett, stated that his experiment was an extension of those by Ray (16) and Rowlett (17). He had this to say concerning the three experiments:


A primary difficulty rests in the ambiguity of 
the nature of the methods used by the experi- 
menters. One cannot say, with any degree of 
confidence, that the directed discovery method 
in the various studies is similar or dissimilar, 
or that it contains the features or characteristics
ascribed to the method by each investigator (9:119).


tion of the previous experiments, if 
"vw gd Pave confounding variables interacting with 
the instructional method, in addition to other methodological problems. The weight of evidence for the significant superiority of the discovery method is thus very limited.


30 THE JOURNAL OF EXPERIMENTAL EDUCATION 


It is possible that a factor in the experiment
could have resulted in one or the other of the groups
being placed at a disadvantage. First, with respect 
to the ruleg group, it is possible that if less time had 
been allowed, ruleg group members may still have 
performed as well, but the egrule group members 
may have performed less well. Second, there are 
a number of factors which could be advanced as being 
to the disadvantage of the egrule group: 


(a) The possible depressing effect of the pre- 
sentation of the rule to the inductive group has been 
noted. However, the rule was first presented not at 
the end but about 60 percent of the way through each 
unit; the possibility of retroactive inhibition would 
thus be expected to be minimal. 


(b) Programmed instruction may possibly not be the most appropriate way of catalyzing discovery. In experiments in which programmed instruction booklets have been used for both instructional groups (e.g., 23) the result has generally been insignificant. Perhaps discovery methods are most effective when a teacher is providing encouragement. (The value of programmed instruction with respect to control is, of course, immense.)


(c) Perhaps there was insufficient time and/or number of examples for egrule groups (apart possibly from the high intelligence grade 9 Ss) to “discover” the rule.


(d) The lack of experience in discovery meth- 
ods and/or the lack of specificity in the instructions 
may have adversely affected egrule group members. 


(e) The set rate of presentation may have af- 
fected the egrule group differently than the ruleg 
group. 


(f) As the tests present a structured discovery form of learning of the ET E^ category (i.e., rule not given, answer not given), the ruleg Ss would also have had some discovery learning and could possibly have “discovered” the rules on one of the tests.


Many further avenues of research would appear
useful. A number of extensions to the present experiment
are possible:


An investigation into the significant interactions
on the category of learning variable to determine 
whether these results were related to the difficulty 
of the task, the organization of the material, or the 
category of learning. 


An investigation to determine what degree of
specificity of instructions is required to optimize the
performance of discovery Ss.


An investigation to determine what pre-training, if any, is required to optimize the performance of discovery Ss.


An investigation using Ss from grade 3 or 4. Ausubel suggested that approximately 12 years was the upper age at which discovery methods are most appropriate; perhaps the age of inflection is 10 years, assuming Ausubel's thesis to be valid. Possibly mental age should also be considered.


Analysis of the responses of Ss in discovery groups could serve as a basis for developing hypotheses. If programmed instruction booklets were used, this could be done by omitting the answers to some of the items. Such data would help to determine whether the Ss had learned the rule before its first presentation, the type of strategy used, and those factors, such as perseveration, which appear relevant when Ss learn a wrong rule by a discovery method.


Assuming that some forms of discovery learning are effective, an attempt could be made to analyze possible mechanisms. For example, it is possible that factors associated with “incidental learning” could also be relevant to discovery learning.


Personality factors could be very relevant to learning by discovery. For example, what is the effect on people of different anxiety levels of being placed in an “unstructured” learning situation?


If a grand research strategy to investigate learning by discovery were to be undertaken, it is considered that sets of reference lesson units in various subject areas and for various difficulty levels should be developed; appropriate pre- and posttests should also be prepared and standardized.


For any research project relating to learning by discovery, it is considered that there should be an emphasis on specifying the operations involved in the instructional method, on attempting to improve the terminology, and on attempting to overcome the methodological problems.
APPENDIX

EXAMPLES OF MAP-READING MATERIALS

(a) Learning Program

Items 24-31

MAP 2 (Smaller than actual size.)


(E24) (R24) Egrule example: Long Bay
On Map 2, we will use the horizontal (or sideways) lines to find the northing of some suburbs. On Map 2, the position of Long Bay, with respect to the horizontal lines, is found by looking from bottom to top (from south to north) and by finding the third digit using the scale. The northing of Long Bay is ___.
Corresponding egrule item*: E28. Time (secs): 50, 75.

(25) (25)
Carefully look at the scale on the map to see how the scale is used to find the northing of Bondi. The northing of Bondi is ___.
Corresponding egrule item*: 25. Time (secs): 30, 45.

(E26) (R26) Egrule example: Newport
For Newport, the northing is found by counting 9 units upwards (towards the top) from the 83 horizontal line. Check on Map 2 using the scale. The northing of Newport is ___.
Corresponding egrule item*: E30. Time (secs): 30, 45.

(E27) (R27) Egrule example: Harbord
Remember to look from bottom to top. The northing of Harbord is ___.
Corresponding egrule item*: E31. Time (secs): 20, 30.

(E28) (R28) Egrule example: Maroubra
On Map 2, the horizontal (or sideways) lines are used to find the northing of some suburbs. The northing of Maroubra is ___.
Corresponding egrule item*: E24. Time (secs): 50, 75.

(29) (29) Egrule example: Mona Vale
In a map reference, there are 6 digits; three digits refer to the easting and three digits refer to the northing. The northing of Mona Vale is ___.
Corresponding egrule item*: 29. Time (secs): 30.

(E30) (R30) Egrule example: South Head
The northing of South Head is ___.
Corresponding egrule item*: E26. Time (secs): 30, 45.

(E31) (R31) Egrule example: Manly
The northing of Manly is ___.
Corresponding egrule item*: E27. Time (secs): 20, 30.


* Except for the specific example used.
(b) Examples of Far Transfer test items.

MAP D (Smaller than actual size.)

Note how Position X is shown. Similarly, put in position R at 193189 on the map.


MAP E (Smaller than actual size.)

Using the position of P (315825), write on the map the missing numbers of the horizontal and vertical lines.
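
The six-digit map references used in these items can be illustrated with a short Python sketch. It assumes the usual convention that the first three digits are the easting and the last three the northing; the learning program states only that three digits refer to each, so the order is an assumption, and the function name `split_map_reference` is hypothetical.

```python
def split_map_reference(reference):
    """Split a six-digit map reference into (easting, northing) digit strings."""
    if len(reference) != 6 or not reference.isdigit():
        raise ValueError("expected a six-digit map reference")
    # Assumed convention: easting first, northing second.
    return reference[:3], reference[3:]


# The two references that appear in the far transfer items above.
easting_r, northing_r = split_map_reference("193189")
easting_p, northing_p = split_map_reference("315825")
```

Under that assumption, position R (193189) has easting 193 and northing 189, and position P (315825) has easting 315 and northing 825.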
THE TEACHING ASSESSMENT BLANK: A FORM FOR THE STUDENT ASSESSMENT OF COLLEGE INSTRUCTORS


DAVID S. HOLMES
University of Texas, Austin


ABSTRACT

A form which students can use to assess their class experiences was presented. A factor analysis of evaluations filled out by 1,648 students revealed four factors which measured (a) the quality of the instructors' presentations, (b) the evaluation process and the student-instructor interactions, (c) the degree to which the students were stimulated and motivated by the instructors, and (d) the clarity of the tests. A further analysis indicated that subscale scores which reflected the factor scores could be developed from the total item pool.


THERE IS a long and varied history to the systematic evaluation of college instructors by their students. Because of the increasing external and internal demands being placed on universities for better teaching, increasing student concern for the quality of instruction, and the fact that it is becoming increasingly difficult or impossible for administrators to visit and evaluate all classes, there is good reason to believe that in the future the evaluation of instructors by students will be more widespread and relied on more heavily.


The Teaching Assessment Blank (TAB) was designed and is being used by the University of Texas at Austin with three goals in mind: (a) to provide data for a publication to be used by students interested in obtaining information about prospective instructors (2); (b) to provide the university administration with data upon which to base, in part, their evaluation of members of the teaching faculty; and (c) to provide the faculty members with feedback in hopes of improving their teaching. It is important to note that the scale was not designed to evaluate an instructor's academic competence or knowledge or the value of the course he teaches. Instead the items on the scale are limited to those related to the instructor's teaching ability. The TAB was constructed with this limitation since it could be argued that many students are not in a position to evaluate an instructor's academic competence or the value of his course. On the other hand, college students are in a position to evaluate the way in which the instructor presents his material. Furthermore, students are in a better position than anyone else to report their own responses to the instructor's presentation, i.e., the interest, effort, and thought provoked in them by the instructor. The ability to stimulate students would seem to be an important aspect of teaching to evaluate, especially if learning is considered to be an active process. The TAB is unique in that unlike previous evaluation forms it has a number of items which directly assess this aspect or result of teaching.


In its present form (Form II), the TAB has thirty-five statement-items (see Table 1), some of which elicit factual responses (“My overall grade average is . . .”), while others elicit subjective responses (“The instructor seemed to be well-prepared for lecture or discussion”).* The items are answered on an answer sheet designed for electrical or mechanical scoring. The TAB is filled out anonymously at the end of the course. The instructions are:

Please evaluate this instructor by indicating
the one response that most nearly expresses
the feeling you have had generally or most of
the time concerning his teaching. Your evaluations
will be of great value: (1) if you state
your own personal feeling without considering
what the sentiment of others might be; and
(2) if you realize that, while it is difficult to
respond exactly to the kinds of statements contained
below, with care you can select a response.
Use a Number 2 or 2 1/2 pencil.
Omit only those items which are not applicable,
or which you do not feel qualified to answer.


The items from the TAB are presented in Table 1. On the basis of their content, these items can be broken down into three types. The first group contains six items which provide basic information about the respondent: year in school (item 1), sex (item 2), grade point average (GPA) (item 4), whether the course was in his major area (item 5), college membership (item 34), and reason for taking the course (item 35). These were included so that persons considering the course on the basis of the published evaluation could identify the characteristics of the students previously enrolled in it. The second group contains twenty-four items related to aspects of the instructor and course under consideration; that is, items which assess the instructor's presentation, examinations, etc. The items in this group will be discussed and classified further with regard to the factor analysis presented in a later part of this paper. The third group contains five items with miscellaneous content; two (items 22 and 33) deal with the out of class contact with the instructor, one (item 28) inquires about the quality of the textbook used in the course, one (item 3) asks about the grade expected in the course, and one (item 6) assesses the degree to which the student wanted to take the course before enrolling.


TABLE 1

ITEM STEMS FROM THE TEACHING ASSESSMENT BLANK (FORM II)*

1. My classification is:
2. Sex:
3. My grade in this course was, or probably will be:
4. My overall grade average at U.T. is:
5. This course is part of my major field:
6. I wanted to take this course before the semester began:
7. The instructor seemed to be well-prepared for lecture or discussion:
8. He used enough examples and illustrations to clarify the materials for me:
9. He presented material in a coherent manner, emphasizing major points and making clear their relationships:
10. He usually was aware of whether the class members were following his discussion or lecture with understanding:
11. He usually held my attention during class:
12. The instructor, through the course, challenged me to be creative in my work:
13. He made me feel free to ask questions, disagree, or express my ideas:
14. He was fair and impartial in his dealings with students:
15. He was intellectually stimulating (he caused me to think):
16. He revealed enthusiasm for his teaching:
17. He let us know what he expected of us on tests and assignments:
18. The meanings of questions on his tests were usually clear:
19. He usually returned assignments and tests promptly:
20. He had sufficient evidence, in terms of class participation and written work, to evaluate the quality of my performance in this course:
21. His grading system was fair to me as an individual:
22. He was readily available for conference outside of class:
23. He seemed to be interested in students as persons:
24. He interested me in the subject of this course:
25. I looked forward to attending class:
26. In comparison with all the instructors I have had both in high school and college, this instructor was:
27. In comparison with all the courses I have had, this course was:
28. All things considered, the text book used in this course was:
29. I made an honest effort to learn in this course:
30. In this course I learned a great deal:
31. The instructor made clear to me his educational objectives in this course:
32. The instructor accomplished his educational objectives in this course:
33. I tried to meet with the instructor outside of class:
34. College or school in which I am now enrolled is:
35. I am taking this course to satisfy:

*Response alternatives for items 6-25, 29-32: Definitely Yes, Yes, No, Definitely No; for items 26 and 27: One of the best, Above average, Average, Below average, Far below average; for item 28: Excellent, Above average, Average, Below average, Poor; for item 33: Many times, A few times, Only once or twice, Never; for item 34: Arts and Sciences, Business Administration, Education, Engineering, Other; and for item 35: Major or minor field requirements, Other specific degree requirements, Elective credits required for degree, Non-degree requirements (e.g., requirements for teacher certification), No requirements at all.


STUDY 1: FACTOR ANALYSIS OF CLASS RELEVANT ITEMS


SUBJECTS

During the spring semester of the 1967-68 academic year, faculty members of the College of Arts and Sciences at the University of Texas were invited to participate in the class-instructor evaluation program in which the TAB (Form II) was the measuring instrument. Evaluations were carried out in 322 classes. From these, seven large classes with a total of 1,648 students responding were selected on a roughly random basis to provide the data for the present analyses. Involved were three classes in [...], two in geology, one in zoology, and one in [...]. The classes ranged in size from 106 to [...]. There were 480 freshmen, 586 sophomores, 394 juniors, 166 seniors, and twenty-two graduate students. The use of only large classes may limit somewhat the generalizability of the results to be reported since there may be factors which influence the evaluation responses in small classes which may not be operating in large classes. Large classes were specifically chosen for study because the success of these classes is more dependent upon the quality of the instructor's presentations than it is in small classes and, since in the future this type of class will no doubt play a dominant role in the undergraduate's university experience, it seemed important to focus attention on this situation. Research to be reported subsequently will deal primarily with the small class.


ANALYSIS

The factor analysis was a principal axis analysis with varimax rotation (3). The analyses were carried out on twenty-two of the twenty-four class-evaluation items. Items 26 (comparison with all instructors) and 27 (comparison with all courses) were not included since they involved summaries rather than specific characteristics.


TABLE 2

VARIMAX LOADINGS FOR EVALUATION RELEVANT ITEMS

Items (arranged by largest factor loading)        F I   F II  F III  F IV

Factor I: INSTRUCTOR PRESENTATION
 7. instructor well prepared                      .80   .10   .08   .19
16. revealed enthusiasm for teaching              .77   .24   .15   .09
 9. presented material in coherent manner         .76   .05   .21   .33
 8. enough examples and illustrations             .75   .06   .17   .34
10. aware of class following discussion           .58   .34   .25   .19
    Mean Loadings                                 .73   .16   .17   .23

Factor II: INTERACTION-EVALUATION
19. returned assignments-tests promptly           .02   .72   .07   .05
13. feel free to ask questions, disagree          .39   .63   .12   .10
20. sufficient evidence to evaluate               .01   .61   .14   .50
21. grading system was fair to me                 .06   .56   .16   .56
23. interested in students as persons             .48   .52   .23   .15
14. fair and impartial with students              .32   .48   .18   .25
    Mean Loadings                                 .21   .59   .15   .27

Factor III: STUDENT STIMULATION
30. I learned a great deal                        .20   .13   .77   .21
29. I made honest effort to learn                 .13   .11   .74   .06
25. I looked forward to attending class           .52   .14   .63   .06
24. he interested me in subject                   .58   .16   .59   .15
15. stimulating . . . caused me to think          .52   .21   .58   .13
11. held my attention during class                .56   .07   .55   .13
12. challenged me to be creative                  .42   .30   .54   .41
32. instructor accomplished objectives            .40   .17   .52   .13
31. instructor made clear his objectives          .36   .22   .48   .38
    Mean Loadings                                 .41   .17   .60   .19

Factor IV: TEST CLARITY
18. test questions clear                          .36   .08   .21   .71
17. let us know what was expected on tests        .48   .10   .17   .63
    Mean Loadings                                 .42   .09   .19   .67
Factor I


An inspection of the content of the items showing highest loading on this factor clearly indicates that it constitutes an evaluation of the instructor's presentation. The items measure, for example, the instructor's preparedness, enthusiasm, organization, use of examples, and awareness of students' understanding. It is important to note that the items with highest loadings on Factor I showed relatively low loadings on the other three factors. The question of the independence of the factors will be discussed in greater detail later. In view of the content of the items with high loadings on this factor it will be referred to as the Instructor's Presentation factor. Factor I accounts for 23.97 percent of the variance.


Factor II


Factor II seems to be measuring two somewhat different but highly related issues; first the instructor's system for evaluating the performance of the students (grading system fair, sufficient evidence to evaluate, returned assignments promptly) while the second was the instructor's relations with students (interested in students as persons, free to ask questions, fair, and impartial). At first these might seem to be quite separate but it could be suggested that in classes with literally hundreds of students the assignments, examinations, and grades provide the major medium of interaction with the instructor for most students; for this reason the issues are related. That is, the student perceives the instructor's relation to the evaluation process to be the only personal response possible. In view of the content of the items with the highest loadings on Factor II, this factor will be referred to as the Evaluation-Interaction factor. This factor accounted for 11.73 percent of the variance (third in order of magnitude) and, as was the case with the Instructor's Presentation factor, the items which have their highest loadings on this factor have relatively low loadings on the other factors.


Factor III


The items with highest loadings on Factor III are, for the most part, those which measure student reactions and behaviors which are evoked by instructors. That is, in contrast to the items making up Factors I and II which directly assessed the behavior of the instructor (his presentation and evaluation system), the items with highest loadings on Factor III assessed what the students did in response to the instructor's behavior. For example, the items measured the degree to which, as a function of the instructor [...]. The behaviors assessed by these items were responses [...] to the actions of the instructor, and therefore these items provide interesting additional information on the instructor's presentation, specifically its [...]. Due to the content of the items loading highest on this factor, it was called the Student Stimulation factor. This factor accounted for 16.82 percent of the variance (second in order of magnitude).

It should be noted that there are two items which have highest loading on this factor but which are not consistent with the other seven items. These items deal with the clarity of course objectives (item 31) and the attainment of the objectives (item 32). The relation of these items to the factor in general is not clear, though it may be that clear goals and their attainment facilitate enthusiasm. Of the items with a primary loading on this factor, items 31 and 32 have the lowest loadings.


Factor III does not seem to be as independent as Factors I and II; the items which make up Factor III also tend to load on Factor I. While the items with the highest loadings on Factor I show a considerable difference in mean loadings on Factors I and III (.73 versus .17), the items with the highest loadings on Factor III show considerably less difference in mean loadings on Factors I and III (.41 versus .60). It seems that a high rating in terms of presentation (Factor I: preparedness, enthusiasm, coherence, etc.) can be relatively independent of the degree to which students are stimulated (Factor III: interest, etc.). On the other hand, if the students are stimulated (Factor III), the instructor is likely to be seen as having given a good presentation. To use theatrical terms, an instructor may give a “technically” good performance but may not “project across the footlights.” To project across the footlights, however, usually requires a good performance. It seems then that a technically good presentation is helpful, but not always sufficient, to motivate students in large lecture classes. While the present data did not directly indicate what in addition to the presentation caused students to become motivated, the degree to which students are stimulated would seem to be an important factor to measure in evaluating instructors, especially if the goal of teaching is seen as more than conveying information, a process which a textbook may do more effectively and inexpensively.
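
The mean loadings quoted in this comparison can be recomputed from the item loadings. The Python sketch below uses the Factor I and Factor III loadings as read from Table 2; the variable and function names are illustrative only.

```python
# Loadings on Factor I and Factor III as read from Table 2,
# keyed by item number: (loading on F I, loading on F III).
factor_i_items = {
    7: (.80, .08), 16: (.77, .15), 9: (.76, .21), 8: (.75, .17), 10: (.58, .25),
}
factor_iii_items = {
    30: (.20, .77), 29: (.13, .74), 25: (.52, .63), 24: (.58, .59),
    15: (.52, .58), 11: (.56, .55), 12: (.42, .54), 32: (.40, .52),
    31: (.36, .48),
}


def mean_loadings(items):
    """Return the mean loading on Factor I and on Factor III, rounded to 2 places."""
    n = len(items)
    mean_f1 = sum(f1 for f1, _ in items.values()) / n
    mean_f3 = sum(f3 for _, f3 in items.values()) / n
    return round(mean_f1, 2), round(mean_f3, 2)


presentation_means = mean_loadings(factor_i_items)   # items defining Factor I
stimulation_means = mean_loadings(factor_iii_items)  # items defining Factor III
```

The Factor I items average .73 on Factor I against .17 on Factor III, while the Factor III items average .41 on Factor I against .60 on Factor III, which is the asymmetry the paragraph describes.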


Factor IV


Only two items have highest loadings on Factor IV and their content is quite consistent. Both items seem to be measuring the clarity of the instructor's tests. Therefore, this factor will be called the Test Clarity factor. These items have second highest loadings on the Instructor's Presentation factor, a finding which suggests that the preparedness, coherence, and awareness of students' understandings which facilitated the presentation of material tended to be reflected also in the preparation of examinations. Factor IV accounts for 10.47 percent of the variance.


[...] that each subgroup measured a different, in most cases, relatively independent aspect of [...]. The factor structure was essentially replicated [...] selection of large classes, thus suggesting that the factor structure is rather stable.


етуу 
Үш 
П: DEVELOPMENT OF SUBSCALES 


In an effort to simplify the interpretation of the
TAB, an attempt was made to arrive at a system of
determining summary scores for each of the four
factorially defined areas, thus reducing the num-
ber of scores from twenty-four to four. One approach
to this problem is to use factor scores from
a factor analysis like that described above. While
statistically elegant, this approach requires that a
factor analysis be carried out each time factor scores
are wanted, and this is impractical or impossible for
most users of the TAB. A second approach is to use
the factor loadings from the above analysis to identi- 
fy items for membership in subscales from which 
sub-scores could be easily derived. That is, items 
with the highest loadings on a factor could be used to
construct a subscale to represent that factor. Sub- 
scale scores would not be perfect representations of 
the factors since in the computation of subscale 
scores any item would contribute all of its variance 
to the one subscale to which it belonged; while in the 
computation of factor scores an item can simulta- 
neously contribute different amounts of variance to 
different factors. The resulting discrepancy may not 
be important, however. The degree to which the sub-
scales actually reflected the factor structure as re-
vealed by the factor analysis can be determined by
comparing the subscale scores with the actual factor
scores. Through the use of a multi-trait (four as-
pects of teaching), multi-method (two types of
scores)* matrix (1) it can be determined whether
the scores of the subscales were similar to the fac-
tor scores they were designed to represent (conver-
gent validity) and whether, as one would hope, these
relationships were higher than those between the var-
ious subscale scores (discriminant validity).
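
The comparison described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the item-to-subscale keying, and the use of NumPy are not from the original study, and the factor scores are taken as already computed by whatever factor-analysis routine is in use.

```python
import numpy as np

def subscale_scores(responses, membership):
    """Sum item responses within each subscale.

    responses:  (n_students, n_items) array of item ratings.
    membership: list of item-index lists, one per subscale
                (a hypothetical keying, not the TAB's actual one).
    """
    responses = np.asarray(responses, dtype=float)
    return np.column_stack([responses[:, idx].sum(axis=1) for idx in membership])

def mtmm_summary(factor_scores, sub_scores):
    """Return (mean validity value, mean hetero-trait mono-method r among subscales)."""
    factor_scores = np.asarray(factor_scores, dtype=float)
    sub_scores = np.asarray(sub_scores, dtype=float)
    k = sub_scores.shape[1]
    # Correlation matrix of all 2k scores: factor scores first, then subscales.
    r = np.corrcoef(np.hstack([factor_scores, sub_scores]), rowvar=False)
    validity = np.array([r[i, k + i] for i in range(k)])  # factor i vs. subscale i
    sub_block = r[k:, k:]
    hetero = sub_block[np.triu_indices(k, k=1)]           # subscale i vs. subscale j
    return validity.mean(), hetero.mean()
```

Convergent validity corresponds to a high first value, and discriminant validity to a second value that is clearly lower than the first.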


Item selection 


A subscale was constructed to represent each fac-
tor previously identified. The sole determinant of
an item’s subscale membership was its loading on 
the factor the subscale was designed to measure and 
therefore subscale item membership was identical 
to the factor item membership outlined in Table 2. 


Multi-trait—multi-method matrix 

Weighted factor scores for Factors I, II, III, and
IV were determined for each S using the data from
the factor analysis presented earlier. Subscale
scores for subscales one, two, three, and four were
determined for each S by summing the responses
given to the items within each subscale. The inter-
correlations between these eight scores are present-
ed in Table 3 in the form of a multi-trait—multi-
method matrix.
Of primary interest are the values in the validity
diagonal (underlined in Table 3) and the values in the
hetero-trait—mono-method triangle for subscale
scores (enclosed by a solid line in the lower right of
Table 3). It is clear from the magnitude of the va-
lidity values (i.e., the correlations between the fac-
tor scores and the subscale scores) that the
subscale scores reflected the factor scores rather
well. That is, there seems to be a considerable de-
gree of "convergent validity." In fact, the mean
validity value is .71, thus indicating that on the aver-
age the subscale scores account for 50 percent of
the variance in the factor scores. As would be hoped,
the hetero-trait—mono-method values for the sub-
scales (i.e., the correlations between the var-
ious subscale scores) are considerably lower than
the validity values. That is, there is some degree
of "discriminant validity." The mean is .51, indi-
cating that on the average the score on one scale
TABLE 3


MULTI-TRAIT—MULTI-METHOD MATRIX FOR
FOUR ASPECTS OF TEACHING, (1) INSTRUC-
TOR'S PRESENTATION, (2) EVALUATION-
INTERACTION, (3) STUDENT’S RESPONSE, (4)
TEST CLARITY, MEASURED BY TWO SCORES*


[Correlation matrix with rows and columns 1-4
under FACTOR and 1-4 under SUBSCALE; the
entries are not legible in the source.]

*The validity values are underlined. Each hetero- 
trait—monomethod triangle is enclosed by a solid 
line. Each hetero-trait—hetero-method triangle is 
enclosed by a broken line. 


accounts for only 25 percent of the variance of the 
score on another scale. Ideally, of course, the hetero-
trait—mono-method correlations could be lower, but
it should be noted that some degree of relationship 
would be expected between the subscales, especial- 
ly, for example, between subscales one and three 
which measure the quality of an instructor’s presen- 
tation and the degree to which students are stimulat- 
ed. (The correlation between these two subscales 
is actually as high as one of the validity correla- 
tions.) It does appear, however, that there is enough 
separation between the scales to warrant their use. 
The use of the subscale scores will provide the user 
with more specific summary information than has 
been available in the past from one global score. 
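
The percentages of shared variance quoted in this section follow from squaring the correlations. A minimal check, taking the reported means as .71 and .51 (the function name is illustrative only):

```python
def shared_variance(r):
    """Proportion of variance two scores share, given their correlation r."""
    return r ** 2

mean_validity = 0.71      # mean factor-subscale correlation reported in the text
mean_hetero_trait = 0.51  # mean correlation among the subscale scores

print(f"{shared_variance(mean_validity):.2f}")      # 0.50
print(f"{shared_variance(mean_hetero_trait):.2f}")  # 0.26
```

The second value, .26, is what the text rounds to 25 percent.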


DISCUSSION 


In view of the preceding analyses, it is clear that
the TAB is a potentially very useful instrument for
carrying out student evaluations of college instruc-
tors. It is evident from the factor analysis that the
TAB measures a number of different but very rele- 
vant aspects of teaching. The existence of this 


multidimensionality and the fact that the various di-
mensions can be easily scored make it possible to
obtain summary evaluations on a number of areas
within the broad topic of “teaching competence.”
In addition to being helpful in terms of providing feed-
back to the teacher, the specificity offered by the ex-
istence and identification of the subscales will also
be of value to the researcher who all too often in the
past could only work with a global score which re-
flected the entire constellation of teaching compe-
tence. This refinement of the measuring instrument
may enable the researcher to more accurately de-
scribe the influence of other variables on teaching
or teacher evaluation.


Of particular interest and potential value is the
identification of the student stimulation factor. The
degree to which instructors instill interest, enthusi-
asm, and motivation seems to be an important aspect
to measure if we see the classroom as only the be-
ginning of education and our goal as the turning of
“pupils” into “students.”



FOOTNOTES 


The present research was carried out during the
author’s tenure as Research Consultant,
Measurement and Evaluation Center, Univer-
sity of Texas, Austin.


The credit for the initial development of the scale
discussed in this paper belongs to Dr. [...]
Kelly, Director of the Testing and Evaluation
Center, and Miss Caroline Dowell, Research
Associate, University of Texas.


In addition to responding to the thirty-five objec-
tive items, the students are invited to make
free response comments on the back of the
answer sheets. These are not considered in
the present analysis.


It should be noted that in this analysis the differ-
ent “methods” were not different methods of
collecting the data but rather different methods
of determining the scores.
ABSTRACT


This was an initial attempt to examine experimentally the orientation of 165 fifth graders and 114 ninth
graders to an individualized educational system. Little support was found for the hypothesis that the amount of
orientation would have an effect on students’ academic performance, and their opinions and knowledge of the sys-
tem. The following guidelines, based on the interpretations of the results of this study, were suggested for fu-
ture research in this area: (1) include pre- and post-measures of student orientation needs on each criterion;
(2) employ a no-orientation control group; (3) control for the effects of inter-treatment modeling by students;
(4) administer criterion measures related to student orientation needs and the orientation instructional objec-
tives; and (5) include relevant indices of teacher ability to implement orientation programs.


DURING THE initial year (1967-68) of indi-
vidualized education in Project PLAN (2) the ob-
servations of teachers, administrators, and field
[staff] suggested that students needed experi-
ences designed to orient them to this individualized
educational system. In this framework, appropri-
ate student responses varied markedly from those
in more conventional contexts. Extensive data, not
systematically collected, suggested that the
[longer] a student had experienced conventional in-
struction, the more difficult it was for him to make
the transition to Project PLAN. Staff observations
indicated that specific orientation activities might
reduce the number of students who appeared to floun-
der during the first months of the
school year.


While this information seemed to emphasize
the need for student orientation activities in Project
PLAN, [little] support or direction for
the development of such learning experi-
ences was available. Four problems were involved.
First, in [few] studies were student orientation needs
systematically assessed before the implementation
of various orientation programs. Subjective
data like those documented in Project PLAN were
not available in the literature. Most investigators
(8, [...]) assumed the existence of student orientation
needs, including those labeled academic-
intellectual, social and informational.
nce orientation research was under- 


Secondly, since orientation research was under-
taken predominantly at the college level, it was dif-
ficult to generalize implications for the elementary
and secondary students of interest in this study.
Thirdly, too few studies assessed the effects of two
or more orientation procedures in order to deter-
mine the best means of meeting students’ needs. Pac[...]
provided an example of one of these studies.
Using random assignment of students and cri-
teria including a college information test, an attitude
[...] Osgood’s Semantic Differential
[...] registration advisor’s rating of
each student’s preparedness for the registration pro-
cess, he found that [...] approach was more ef-
fective in orienting students than were programmed
workbook strategies. [...] that his
results were not replicated.

Finally, evidence indicating whether or not stu-
dents’ needs were met by orientation programs was
conflicting. The findings of several studies (4, 5,
10, 11, 12, 13, 14, 15) suggested, on the basis of pos-
itive student and staff reactions, that orientation was
[worthwhile]. Yet the results of other similar studies
using more varied criteria indicated that orientation
may have been a waste of student and teach-
er time (1, 6, 16) or might have even hindered
student performance (3).


Assuming the validity of the available subjective 
data on Project PLAN students’ orientation needs, 
the present study focused on the last three problems. 
Its Ss were fifth and ninth graders, not college stu- 
dents. It employed a design enabling a comparison 
of two orientation procedures and using more rigor- 
ous and extensive criteria than just student and staff 
reactions. The effects of the amount of orientation 
on student knowledge, opinions, and overt behavior 
including academic performance were examined. Cri- 
terion measures were developed to relate closely to 
Project PLAN student orientation objectives and this 
contrasted with many previous orientation studies in 
which instruments, apparently only indirectly related 
to the programs’ orientation objectives, were em- 
ployed. It was hypothesized that students who partic- 

ipated in a comprehensive orientation program would 
perform better academically, have more knowledge 
of the educational system, and have more favorable 
opinions about the system than would students who 
participated in a brief orientation program where the 
amount of information, student practice of relevant 
behaviors, and involvement provided was reduced to 
that deemed the essential minimum.


METHOD 
Subjects 


The sample included 114 ninth-grade students
from two schools and 165 fifth-grade students from
four schools. All students had been selected at ran-
dom for Project PLAN’s individualized education pro-
gram. All schools were in the San Francisco Bay
Area.


Orientation Programs 


Student orientation materials consisted of book- 
lets containing orientation information and sugges- 
tions as to what students might do and use to achieve 
the instructional objectives of the orientation program.


The orientation objectives were sequenced so 
that students proceeded from an introduction to the 
new education program, to learning the specific be- 
haviors needed to function in the system, and on 
through individual planning and scheduling of their 
work in each subject matter area. The variety of 
learning activities recommended included reading, 
teacher or student demonstration of instructional 
equipment and materials, group discussions, individ- 
ual planning sessions with teachers, and practice ses- 
sions for the performance of skills important in the 
individualized education system.


Two versions of these orientation materials were
prepared for each of two grade levels (i.e., fifth and
ninth). One version constituted a comprehensive and
the other a brief orientation program. The versions
differed in two major respects. The amount of infor-
mation and behavioral practice included in the brief
version was substantially less than that included in
the comprehensive version. For example, students
in the brief orientation program had fewer opportuni- 
ties to practice operating audio-visual equipment and 
to participate in group discussion of various aspects 
of the educational system. The brief version also 


allowed each student less involvement in planning and 
decision making relative to his academic work. For
example, the teachers of the students in the brief pro- 
gram specified the type and amount of work each stu- 
dent would try to do; while the teachers of the students 
in the comprehensive program met with each student 
to discuss the type and amount of work each student 
would try to achieve. The comprehensive orientation 
program consumed, on the average, about 5 days of 
student time, while students completed the brief pro- 
gram in 2 days or less. 


Criterion Instruments and Measures 


As noted earlier, no systematic, objective data
on student orientation needs were available on which
to base the development of this study's criterion in-
struments. Reliance was placed upon previous obser-
vations of students by school personnel and upon a
study of the materials and procedures with which stu-
dents in the system would have to cope if they were to
perform successfully. Therefore, the validity of in-
struments used in this investigation is limited to face
validity.


Instrument development was based on the specif-
ic instructional objectives of the orientation materials
and the outcomes of student knowledge, attitude, or
overt behaviors, which indicated the achievement of
these objectives. Thus, all parts of each of the follow-
ing instruments were keyed to the orientation objectives:


Orientation Module Tests Number 1 and Number
2. To test Ss’ knowledge of the PLAN program im-
mediately after they completed their orientation activ-
ities, two multiple-choice tests were written at each
grade level. These tests were prepared in the same
format as the subject-matter tests students regularly
took in this individualized program. One test ([...] items
for fifth graders and 48 items for ninth graders) sam-
pled Ss’ knowledge and performance relevant to the
first two orientation objectives, while the other test
(10 items for fifth graders and 10 items for ninth
graders) focused on a second pair of orientation ob-
jectives. Representative examples of an instructional
objective, a related student outcome, and a test item
for fifth graders follow.


Sample Instructional Objective:


To describe the responsibility and role of a
PLAN student.

Representative related student outcome:


Know and be able to perform procedures for
working through teaching-learning units and for
end-of-unit tests.


Sample Test Item:


If you receive an NP on a module test, your
next learning activity should be to


A. stop working on Project PLAN for a while.

B. go on to the next module.

C. take another TLU in the same module, or re-
peat the same TLU.

D. tell your friends.


Orientation Survey Test. To check Ss’ retention
[...] acquisition of knowledge, a 54-item
multiple-choice test was administered at the fifth-
grade level 3 weeks after the orientation program.
Similarly, a 65-item test was simultaneously given
to ninth-grade Ss. Both tests sampled the same ob-
jectives and related knowledge outcomes assessed by
the Orientation Module Tests, including knowledge
of how the PLAN system functions, terminology and
procedures used, and responsibilities of students and
school personnel. A sample question from the sec-
ondary level survey test follows.


Sample Test Item:


The two ** signs in front of objective 3186 mean
that this student should


A. know he did exceptionally well on that objective.

B. know that this objective has a double number.

C. review this objective and then meet with his
teacher.

D. take the test for that objective again.


Structured Interviews. Two weeks after orien-
tation, 20-minute interviews were administered by
research assistants using structured questionnaires.
These interviews assessed the extent to which ran-
domly selected Ss in each treatment were able to ex-
press their understandings of selected concepts and
procedures in PLAN and to perform behaviors which
were necessary for Ss to function in PLAN class-
rooms.


Once again, an example of an instructional ob-
jective is provided below along with a representative
student outcome and an example test item. All in-
terview items were structured to permit straight-for-
ward scoring as either “pass” or “fail” by the
interviewers using detailed sample scoring keys.


Sample Instructional Objective:


To find and to operate correctly all learning
aids in a PLAN classroom.

Representative related student outcome:

Correctly operate tape recorder and tapes.

Sample Interview Item:

Ask the student to go to the tape recorder, put
on a tape, and begin playing it.


Opinion Surveys. As measures of students’ ex-
pressed opinions toward the PLAN educational sys-
tem, a 14-item survey for fifth-grade Ss and a
[...]-item survey for ninth graders were developed. The
possible effects of response set were counterbalanced
by phrasing the statements so that to agree with some
would represent a favorable opinion of PLAN while
to agree with others would represent an unfavorable
opinion of PLAN. Degree of agreement or disagree-
ment on a 4-point scale (to force a decision in one
direction or the other) was requested. Available re-
sponses were “strongly agree,” “slightly agree,”
“slightly disagree,” and “strongly disagree.” The
opinion surveys were administered at the same time as
the Orientation Survey Tests, 3 weeks after the
orientation program.


Sample Instructional Objective:


To describe the responsibility and role of a
PLAN student.

Representative related student outcome:

Know and appreciate that a major decision maker
in PLAN is the student.

Sample test item:

The teacher makes all the decisions in a PLAN
class.
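
One way to score a counterbalanced survey of this kind is sketched below. The numeric coding (4 for "strongly agree" down to 1 for "strongly disagree"), the reversal rule, and the function name are assumptions made for illustration; the article does not report how responses were converted to scores.

```python
# Assumed coding of the four available responses (not given in the article).
SCALE = {
    "strongly agree": 4,
    "slightly agree": 3,
    "slightly disagree": 2,
    "strongly disagree": 1,
}

def opinion_score(responses, unfavorable_items):
    """Total favorability score for one student.

    responses:         dict mapping item number -> response label.
    unfavorable_items: set of item numbers worded so that agreement
                       expresses an unfavorable opinion of PLAN.
    """
    total = 0
    for item, answer in responses.items():
        value = SCALE[answer]
        if item in unfavorable_items:
            value = 5 - value  # reverse-key: agreement now counts against PLAN
        total += value
    return total
```

Under this coding the sample item above would be reverse-keyed, since agreeing that the teacher makes all the decisions runs against the objective that the student is a major decision maker.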


Academic Performance Records. Constantly
maintained within Project PLAN are computer rec-
ords of students’ performance in their four subject
matter areas: Language Arts, Mathematics, Social
Studies, and Science. Students usually demonstrate
their successful achievement of a group of instruc-
tional objectives, called a “module,” by taking a
module test whenever they and their teachers agree
that they are adequately prepared. Not all students
work on the same modules; however, all modules
are intended to take a roughly equivalent amount of
instructional time for the average student for whom
they are appropriate. A computer printout of all Ss’
academic records was provided 10 weeks after the
orientation activities. This enabled a frequency
count of the number of learning units students had
successfully completed during these 10 weeks.


PROCEDURES 


Within each participating classroom, students 
were randomly assigned, regardless of sex, to either 
the comprehensive or the brief orientation program. 
It was impossible to implement an inactive control 
treatment because teachers refused to withhold all 
orientation assistance from some of their students. 
Such refusal might be ascribed to their observations 
of the deleterious effects of such a treatment during 
the previous year of Project PLAN. Similar nega-
tive reactions were received regarding possible pre-
treatment administration of criterion measures.
Since there was a reluctance to greet students with 


a series of tests immediately upon their return to 
school in September, no pretest data were collected 
at that time. Therefore, this study’s design had to 
rest upon previously gathered information on PLAN 
students’ orientation needs, random assignment of 
Ss, and analysis of variance statistics on post-treat- 


ment data only. 

With the help of research assistants and printed
guidelines, each PLAN teacher administered the two
treatment procedures without informing students that
an experiment was being conducted. Because stu-
dents in any PLAN classroom seldom work on the
same materials, there were no problems in intro-
ducing two sets of orientation materials. All crite-
rion instruments except the structured interviews
were administered to all students. One half of the
Ss receiving each treatment within each classroom
were randomly selected for interviewing because it
was not economically feasible to interview all stu-
dents.
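
The within-classroom assignment and interview subsampling described in this section can be sketched as follows. The even split between treatments, the function name, and the seeding are assumptions for illustration; the article says only that assignment was random within each classroom and that half of each treatment group was interviewed.

```python
import random

def assign_within_classroom(students, seed=None):
    """Randomly split one classroom into two treatments, then pick
    half of each treatment group for the structured interview."""
    rng = random.Random(seed)
    shuffled = list(students)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2  # assumed even split between treatments
    groups = {"comprehensive": shuffled[:half], "brief": shuffled[half:]}
    interviewed = {
        name: rng.sample(members, len(members) // 2)
        for name, members in groups.items()
    }
    return groups, interviewed
```

Running the function once per classroom keeps classroom membership balanced across the two treatments.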


RESULTS


On each of the dependent variables at each grade
level, a 3-way (treatment by sex by school) analysis
of variance with unequal cell sizes was used to ana-
lyze the available data. Where the data were in the
form of frequencies, Hartley’s test (17) was run
to check for homogeneity of variance. Where vari-
ances were significantly heterogeneous, log
TABLE 1

MEAN SCORES FOR MALES AND FEMALES AT EACH
SCHOOL AND IN EACH TYPE OF ORIENTATION PROGRAM
FOR EACH OF THE NINE CRITERIA


[Table body not legible in the source. Column groups:
Comprehensive Orientation Program and Brief Orientation
Program, each by sex (Male, Female) and by school*.
Rows (criteria): Module test #1; Module test #2**;
Survey test; Opinion Survey; Structured Interview;
# Math Modules; # Science Modules; # Soc Stud Modules;
# L A Modules. Legible fragments of cell values:
4.0, 3.5, 5.2.]


* Schools 1-4 represent the four intermediate-level schools: 
schools 5-6 represent the two secondary-level schools. 

** Blanks indicate no available data for that cell. Module test 
# 2 could not be administered at school 5. 
TABLE 2


SIGNIFICANT F-SCORES*** AND THEIR CORRESPONDING DEGREES OF FREEDOM FOR
THE MAIN AND INTERACTION EFFECTS FOR EACH OF THE NINE CRITERIA AT THE
INTERMEDIATE LEVEL (IL) AND SECONDARY LEVEL (SL)


[Table body not legible in the source. Rows: Sources of
Variance, beginning with Treatment (T). Columns (criteria):
Module tests; Survey test; Opinion Survey; Structured
Interview; # of Math Modules; # of Science Modules;
# of Social Studies Modules; # of Language Arts Modules.
Legible fragment: F = 16.7, 3/155 df.]
transformations were made and Hartley’s test was
run again. In all but one case (i.e., the number of
Science modules completed at the intermediate level)
variances were, either before or after transforma-
tions, homogeneous.
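
The variance check described here can be sketched as below. Hartley's statistic is the ratio of the largest to the smallest cell variance; the resulting value is compared with a tabled critical value that depends on the number of cells and their degrees of freedom, which this sketch does not include. The function names and the `shift` added before taking logarithms (so that zero counts can be transformed) are assumptions, not details from the article.

```python
import math
from statistics import variance

def hartley_fmax(groups):
    """Hartley's F-max: largest sample variance divided by the smallest."""
    variances = [variance(g) for g in groups]
    return max(variances) / min(variances)

def log_transform(groups, shift=1.0):
    """Apply log(x + shift) to every score; shift is an assumed constant."""
    return [[math.log(x + shift) for x in g] for g in groups]

cells = [[1, 2, 3], [2, 4, 6]]             # illustrative counts only
print(hartley_fmax(cells))                 # 4.0
print(hartley_fmax(log_transform(cells)))  # smaller ratio after the transformation
```

When spread grows with the mean, as is common for frequency counts, the log transformation tends to pull the cell variances closer together.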


In Table 1 are the mean scores obtained on the
nine dependent variables by males and females from
each school and in each type of orientation program.
Table 2 shows the significant F-scores and their
corresponding degrees of freedom for the main and
interaction effects at each grade level for each of
the nine criteria. Underlined F-scores indicate that
the data were transformed by log transformations,
as variances were found to be heterogeneous by Hart-
ley’s test.


The effect due to amount of orientation was sig-
nificant in only two of eighteen F-tests. The stu-
dents in the comprehensive orientation program per-
formed significantly better on Module Test No.
1 (at the secondary level—SL) and Module Test No.
2 (at the intermediate level—IL) than did stu-
dents in the brief orientation program.


With regard to the effects of Ss’ sex, five of
eighteen F-tests were significant. Female subjects
performed significantly better on Module Test No.
1 (IL) and had significantly more favorable opin-
ions toward the PLAN program of individualized ed-
ucation (IL) than did male Ss. Male Ss performed
significantly better during the interviews (SL) and
completed significantly more Science modules (SL)
and Social Studies modules (SL) than did females.


The effect due to school differences was signifi-
cant in twelve of eighteen F-tests. There was sig-
nificant variability among the schools on the follow-
ing criteria (school rankings are based on the in-
formation in Table 1): Module Test No. 1 (IL, School
2>3>1>4; SL, School 5>6); Survey Test (IL, School
1>2>3>4); Opinion Survey (IL, School 3>1>4>2); Struc-
tured Interview (IL, School 3>2>1>4); number of Mod-
ules completed in Mathematics (IL, School 3>1>2>4;
SL, School 5>6), in Science (IL, School 3>4>1>2;
SL, School 5>6), in Social Studies (IL, School
3>4>1>2), and in Language Arts (IL, School 3>1>2>4;
SL, School 5>6).


Of the seventy-two interaction effects only four were sig-
nificant. The effect of the Ss' sex varied with the
particular school they attended (i.e., on Module
Test No. 1—SL and on the Survey Test—SL), with
treatments on the Opinion Survey (IL), and with the
second-order interaction of treatments and schools
on the Survey Test (SL).
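
The counts of eighteen and seventy-two quoted in this section can be reproduced from the design: nine criteria analyzed at two grade levels, with three factors in each analysis. The short tally below is only a bookkeeping aid; the variable names are illustrative.

```python
from itertools import combinations

criteria = 9                          # dependent variables
levels = 2                            # intermediate (IL) and secondary (SL)
factors = ["treatment", "sex", "school"]

tests_per_effect = criteria * levels  # F-tests for any one main effect
interactions = [c for r in (2, 3) for c in combinations(factors, r)]

print(tests_per_effect)                      # 18
print(len(interactions))                     # 4
print(len(interactions) * tests_per_effect)  # 72
```

The four interaction terms are the three first-order pairs plus the single second-order interaction of treatment, sex, and school.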


DISCUSSION 


The results of this study provided little sup- 
porting evidence for the major hypothesis that stu- 
dents who participated in a comprehensive orienta- 
tion program would perform better academically, 
have more knowledge of the educational system, and 
have more favorable opinions about the system than 
students who participated in a brief orientation pro-
gram. Students in the comprehensive orientation
program actually performed better than students
in the brief orientation program on only the two Mod-
ule Tests (No. 1 at the SL and No. 2 at the IL),
which reflected students’ knowledge of the school
system immediately after they completed their ori-
entation activities. When comparing all students in
the two programs, there were no differences on any
other criteria.


The performance of male Ss differed from that
of female Ss on several criteria. Other than attrib-
uting the results to possible maturation experiences,
it is difficult to explain those findings which indicated
that when males and females differed at the SL, males
performed better; when they differed at the IL, fe-
males performed better. When the effects of schools
were significant, students at one school consistently
performed better than all other students, and stu-
dents at a second school performed worse than all
other students on most of the criterion measures.
Specific differences in teacher abilities (e.g., the
ability to conduct a PLAN classroom) might have
contributed to these rather consistent school effects.
Few interaction effects resulted and these were never
significant at both the intermediate and the secondary
academic levels.


The lack of evidence supporting the major hy-
pothesis may have been related to several of the fol-
lowing factors. While there were extensive subjec-
tive data on which to base a belief that PLAN stu-
dents had specific orientation needs, no objective
measure of each student's needs was administered
as a pretest device. If the orientation needs of these
students were different in kind or intensity from those
anticipated, differential effects due to the amount or
comprehensiveness of orientation perhaps should not
have been expected.


It is also possible that the amount of orientation,
as depicted by the comprehensive program, was more
than necessary. Perhaps the assumption that “the
more information (or practice) the better” is not
true with respect to orienting students. The best
orientation program might be the one that provides
each student with “just enough to get going in the
system” and then allows the student to learn by per-
forming in the system.


Since the students in the comprehensive and brief
orientation programs were in the same classrooms,
students in the two groups could have learned from
one another, therefore confounding the effects of the
treatments employed. Furthermore, the teachers
may not have performed as the research design re-
quired. For example, they may have conducted group
discussions in which material that was designated
for the comprehensive program was covered with all
the students. Or, students in the comprehensive pro-
gram may not have participated in all the activities
designated for their orientation program. Research
assistants attempted to monitor teacher implemen-
tation of the research design but were unable to con-
trol and correct all deviations.


These results and implications suggest a number
of guidelines for future research attempts to assess
the effects of an orientation program for elementary
and secondary students. First, the orientation needs
of each student must be objectively assessed before,
as well as after, the implementation of various pro-
grams or treatments. Secondly, in spite of the ad-
ministrative difficulty involved, a no-orientation
control group should be employed to see if orienta-
tion of any kind is actually worthwhile. Thirdly,
the effects of student role modeling and student in-
teraction in the same classroom need to be either
manipulated or specifically taken into account, per-
haps by using measures of teacher characteristics
so that all the students in a classroom could parti-
cipate in the same program. Fourthly, criterion
measures closely related to the orientation instruc-
tional objectives (and thus, to student orientation
needs) should be employed. Hopefully, the objec-
tives and the instruments will focus on more than
student opinion and attitude variables. Finally, since
it appears that the teacher makes a difference in the
effectiveness of orientation activities, indices mea-
suring factors which contribute to this difference
should be incorporated into future research designs
(e.g., if how a teacher relates to and understands
his students affects how his students learn, then this
teacher quality or ability should be examined).


If subsequent research studies pursue these
guidelines with more vigor than was possible in the
present study, improved orientation programs should
result. Only in these ways can questions such
as the following be examined closely: (1) What stu-
dent orientation needs exist for each student? (2)
How can these needs best be met? (3) How can the
degree to which these needs are met be determined?


This investigation was the first to examine ori-
entation procedures in individualized educational
systems and was an integral part of current develop-
mental work on a comprehensive career guidance
system (7) for such contexts. Also, it extended
experimental research on orientation programs to
academic levels below those of higher education, and
it made progress toward using criterion measures
directly related to the orientation instructional ob-
jectives and student needs. Subsequent research
should extend these efforts and should incorporate
the research design improvements suggested by the
results of this study.

FOOTNOTES 
[...] administrators in the following California
[...] the implementation of this study: Archdiocese
[...] Santa Clara Unified School District. This
study is part of a project being conducted
pursuant to a contract with the Office of Ed-
ucation, United States Department of Health,
Education, and Welfare (Contract No. OEG-
0-070109-3530 (085), Research Project No.
7-0109).
[...] Research Program of the American Insti[...]
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AGE, DEGREE OF TRAIN G, AND TYPE 


OF EXTRADIMENSIONAL SHIFT 


IN NORMALLY INTELLIGENT HUMANS 


MICHAEL D. LeBow? 
University of Manitoba 


ABSTRACT 


The experiment examine 
performance. i 


FOR MANY YEARS psychologists investigating
the processes determining the ... of discrimination
learning have used the transfer methodology.
Conflicting results have been obtained, especially
with the extra-dimensional (ED) shift when
training trials beyond criterion (overtraining) have
been manipulated; this transfer task requires that
the relevant training dimension become irrelevant
during transfer. Caul and Ludvigson (1), for example,
found that overtraining facilitated ED transfer
learning with adults, while Furth and Youniss (3),
using normally intelligent children, showed that
overtraining did not facilitate ED shift. Furthermore,
Heal (6) found that overtraining inhibited ED transfer
in a sample of retardates.


Several studies have raised the amount of original
training beyond two values (i.e., criterion
training and overtraining) and have found that the
relationship between degree of training and ED-shift
errors is not linear. For example, Iwahara and Sugimura
(7) administered an ED shift first to 4 and
5 year olds and then to 7 and 8 year olds while varying
the amount of training between five and forty
consecutive correct responses. Although the overall
training effect was not significant, the authors found
an inverted U shaped function with the ED-task becoming
easier when the training criterion was increased
beyond ten consecutive correct responses.
With adults (15-19) a similar curvilinear relation
has been found but with a maximum point lower than
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(15), Speculating as to why these and oth?’ 


vergent results have be 1 ош ns 
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бардар таныр experiments аге (а) the ED-cO ele” 
paradigm, wherein cue values on the originally relevant
training dimension remain the same from training to
transfer, and (b) the ED-change paradigm, wherein cue
values of the originally relevant training dimension
are different in transfer. When overtraining has been
found to facilitate ED-shift, the ED-constant
procedure has generally been employed. When no
facilitation or inhibition has been found, ED-change
or some variant of it has been used. Unfortunately,
the age of the Ss employed has been confounded
with these paradigmatic differences:
children up to the age of вагонау 
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Study has sy i 
stematically attempted to manipulate age,
degree of training, and type of ED-transfer at one
time, several experiments have been reported in
which some of these variables were separated. For
example, there is suggestive evidence that overtraining
does not result in large facilitating effects with
ED-shifts involving cue changes along the formerly
relevant dimension in Ss over 15 years of age (5).
This is in contrast to the usually large effects obtained
with
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D-const raining does not strongly facilitate the 
ant task in children (7). Because the ef- 
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intelligence on ED-transfer is also confound- 
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cue values and reinforcement contingencies are al- 
tered (i.e., ED-change). That is, the change from
training to transfer may be equally obvious to both
criterion and overtrained Ss given the ED-change
task. For the ED-constant, however, the change
will be clearer for the overtrained.

H-5: Overtraining will not facilitate ED-constant
in younger Ss (5 and 6, 7 and 8). While the effects
of overtraining on ED-shift using children have not
been systematically investigated, most of the studies
have shown no facilitation. With the ED-constant
task, in particular, overtraining seems to result
in only weak facilitation. Perhaps, with young
children, the additional trials employed have not
been enough to enhance the discrimination of change
between (29405 


METHOD 


Subjects and Design 


The Ss, 5еуе 
numbers to three different age groups. 


group of Ss ranged from 18 throug) 

(mean age 19 years 5 mon 

group was 103 excluding two persons W 

dropped from the study because their IQ's markedly 
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The main experimental design was a 3x3x2 fac- 
torial having four Ss per cell, with ages of Ss (5 and
6, 7 and 8, and 18-21), levels of training (5, 10, and
10+40), and type of ED-transfer (ED-change and
ED-constant) serving as the independent variables.
The major dependent variables used in the analyses
were trials and errors to criterion after transfer.
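
The design described above can be laid out as a short sketch. The cell labels below are illustrative names only (they are not taken from the original materials); the counts follow from the stated 3x3x2 layout with four Ss per cell.

```python
from itertools import product

# Factor levels as stated in the text; the string labels are illustrative.
AGE_GROUPS = ("5-6", "7-8", "18-21")
TRAINING_LEVELS = ("5", "10", "10+40")
SHIFT_TYPES = ("ED-change", "ED-constant")
SS_PER_CELL = 4


def design_cells():
    """Return every age x training x shift combination of the factorial."""
    return list(product(AGE_GROUPS, TRAINING_LEVELS, SHIFT_TYPES))


if __name__ == "__main__":
    cells = design_cells()
    print(len(cells), "cells,", len(cells) * SS_PER_CELL, "Ss in total")
    # Pooling one training x shift condition over the three age groups
    # gives 3 x 4 Ss, which matches the N of 12 listed in Table 3.
    print(len(AGE_GROUPS) * SS_PER_CELL, "Ss per pooled comparison group")
```

Under these assumptions the design has 18 cells and 72 Ss, and a comparison restricted to one age group rests on 4 Ss per condition.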


Materials 


The stimuli used in the experiment were geometric
designs varying on two dimensions, each with two
levels: form (square and triangle or circle and
cross); color (black and red). These stimuli were
prepared on 2-inch by 2-inch, 35 mm slides, with
each slide being a photograph of two different forms
of two different colors (e.g., black square and red
triangle) and were randomly presented with the
constraint that no identical slides appeared twice in
succession. Two slide projectors, one for training and
one for transfer, were used to project the stimuli on
a translucent viewing screen located in front of the
S. A 2-button response panel with transparent buttons
was placed between the S and the translucent
projection screen. A candy dispenser was located
at the S's immediate right side, and a candy was
dropped into a small paper container after every
correct response. Stimulus presentation, switching
from the first projector (training slides) to the second
one (transfer slides), information feedback, and
the dispensing of candy were controlled by a BRS solid
state logic system. During the task, the E tabulated the
data while sitting behind the experimental apparatus.
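
The presentation constraint (random order, but no identical slide twice in succession) can be sketched as follows. This is only an illustration of the constraint, not the BRS logic actually used; the slide labels and the function name are hypothetical.

```python
import random

# Hypothetical labels for one form pair: each slide shows two forms of two
# colors, in a left/right arrangement.
SLIDES = (
    "black square | red triangle",
    "red triangle | black square",
    "red square | black triangle",
    "black triangle | red square",
)


def slide_sequence(slides, n_trials, rng):
    """Draw slides at random, never repeating the immediately preceding one."""
    sequence = []
    for _ in range(n_trials):
        # Exclude only the slide shown on the previous trial.
        allowed = [s for s in slides if not sequence or s != sequence[-1]]
        sequence.append(rng.choice(allowed))
    return sequence


if __name__ == "__main__":
    for slide in slide_sequence(SLIDES, 8, random.Random(1)):
        print(slide)
```

The sketch assumes at least two distinct slides; with only one slide the constraint could not be met.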


Procedure 


Each S served individually and the instructions he
received specified the details of the experiment as
well as the response-reward contingency. All Ss
were shown a sample training slide while the instructions
were being read. In addition to the regular
instructions, the 5 and 6 year olds were given a
pre-training task to acquaint them further with the nature
of the problem. Pilot data indicated that young children
had difficulty understanding the regular instructions,
especially the part pertaining to response and
feedback. The pre-training stimuli consisted of several
35 mm slides, each being a photograph of two
people or animals. The Ss learned to choose the
stimulus which depicted the E's statement (e.g.,
"press the button underneath the mother" or "press
the button underneath the man"). Five consecutive
correct responses were sufficient to end pre-training
and to begin the regular instructions, and after these
were read, the task was begun.


All the training tasks were form relevant, color
irrelevant discriminations with either

ie ar circle being the positive cue. For all Ss, 

red and black were the values of the irrelevant color
dimension.


Training. 


Transfer automatically began as soon as the training
criterion was reached. For all Ss, transfer consisted
of a color relevant, form irrelevant discrimination
with red being the positive cue
and black the negative one. For half of the Ss, the
previously relevant form training cues became irrelevant
but did not change in value (ED-constant).
For the remaining half, these cues did change to new
values on the form dimension (ED-change). The
transfer task continued until either ten consecutive
correct responses were emitted or the eightieth trial
was reached. At the end of the experiment, each S
was allowed to keep his candies or cash them in for
a penny each, whichever he preferred, and a brief
interview concerning the S's ability to describe the
experiment was administered.
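
The stopping rule for transfer (ten consecutive correct responses or the eightieth trial, whichever comes first) can be written as a small scoring routine. This is a sketch of the rule as described, with hypothetical names; it is not the original apparatus logic.

```python
def run_to_criterion(responses, criterion=10, max_trials=80):
    """Score a response record against a consecutive-correct criterion.

    `responses` is an iterable of booleans (True = correct response).
    Returns (trials, errors, reached), where `reached` says whether the
    run of `criterion` consecutive correct responses was achieved.
    """
    trials = errors = streak = 0
    for correct in responses:
        if trials == max_trials:
            break  # trial ceiling reached without meeting criterion
        trials += 1
        if correct:
            streak += 1
        else:
            errors += 1
            streak = 0  # any error restarts the consecutive count
        if streak == criterion:
            return trials, errors, True
    return trials, errors, False


if __name__ == "__main__":
    # Three errors followed by ten correct responses.
    print(run_to_criterion([False] * 3 + [True] * 10))
```

The same routine with `criterion=5` describes the lowest training level; the 10+40 condition would add forty further trials after the ten-in-a-row criterion is met.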


During both training and transfer, the stimuli were
projected on the translucent screen for an interval
of time which was terminated by the S's response.
Each S was required to select one of the two stimuli
presented by pushing one of the two transparent buttons
on the response panel. The S was instructed to
push the button underneath the stimulus he thought
was correct. Immediately after the S's response,
the stimuli were automatically removed from the
screen and the button underneath the correct picture
lit up for 5 seconds. Nine seconds following the S's
response, a new pair of stimuli were presented. In
addition, if the S was correct, candy was dispensed
from the machine into the cardboard box. By answering
the S's questions before the task began, conversation
between the E and each S was minimized
during the experiment.


RESULTS 


The dependent variables of primary interest were
trials and errors to criterion. Because these two
measures were highly correlated (r = .95 and .97 for
training and transfer, respectively), only the analyses
of errors will be considered.

Training 

The age by training by shift analysis of variance
on number of errors to the fifth consecutive correct
response is presented in Table 1. The analysis of
variance revealed only an age effect, with the older
Ss making fewer errors than the younger ones (F
[2, 54] = 3.8, p < .05), although the magnitude of
this effect was not large (E² = .098). Age of the
S was related to the ability to verbalize the correct
stimulus-response (S-R) contingencies of training.
From the interview administered at the end of the
experiment, it was found that while twenty-four of
the adults and seventeen of the 7 and 8 year olds could
label the correct training cue, only two of the 5 and
6 year olds could accomplish this task. Errors were
counted only to the fifth consecutive correct response
to equate the different training levels. The probability
of any S's making an error after the fifth consecutive


TABLE 1


SUMMARY OF ANALYSIS OF VARIANCE OF
ERRORS IN TRAINING


Source             df       MS         F

Age                 2     162.87     3.8153*
Training Level      2     114.67     2.6860
Shift               1       5.01      .1174
A x T               4      65.04     1.5235
A x S               2      42.35      .9919
T x S               2       7.06      .1652
A x T x S           4      21.64      .6414
Ss                 54      42.69

*p < .05
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TABLE 2 


SUMMARY OF ANALYSIS OF VARIANCE OF ERRORS
IN TRANSFER


Source                     df       MS          F

Age                         2    1377.06    11.6180**
Training Level              2     111.06      .9365
Shift                       1     249.39     2.1040
Age x Training              4     375.37     3.1669*
Age x Shift                 2     141.56     1.1942
Training x Shift            2     309.56     2.6116
Age x Training x Shift      4     215.03     1.8142
Ss/groups                  54     118.53
Total                      71     181.55

*p < .05
**p < .001


correct response was fairly low in all the conditions
wherein the criterion of ten consecutive correct
responses was required (i.e., only three of the 5 and 6
year olds, three of the 7 and 8 year olds and one of
the 18-21 year olds made errors after achieving five
consecutive correct responses).
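
The age effect reported for training can be checked against the mean squares printed in Table 1. The sketch below assumes that E² is the correlation ratio, i.e., the effect sum of squares divided by the total sum of squares; that reading is an assumption, although it does reproduce the reported value. The mean squares are copied as printed in the table.

```python
# (df, MS) pairs as printed in Table 1.
TABLE_1 = {
    "Age": (2, 162.87),
    "Training Level": (2, 114.67),
    "Shift": (1, 5.01),
    "A x T": (4, 65.04),
    "A x S": (2, 42.35),
    "T x S": (2, 7.06),
    "A x T x S": (4, 21.64),
}
ERROR_DF, ERROR_MS = 54, 42.69  # the "Ss" row


def f_ratio(effect):
    """F for an effect: its mean square over the error mean square."""
    return TABLE_1[effect][1] / ERROR_MS


def correlation_ratio(effect):
    """SS(effect) / SS(total), assumed here to be the reported E squared."""
    ss_effect = TABLE_1[effect][0] * TABLE_1[effect][1]
    ss_total = sum(df * ms for df, ms in TABLE_1.values()) + ERROR_DF * ERROR_MS
    return ss_effect / ss_total


if __name__ == "__main__":
    print(round(f_ratio("Age"), 4), round(correlation_ratio("Age"), 3))
```

With these inputs the age effect comes out at F of about 3.815 and a correlation ratio of about .098, in line with the values given in the text.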


Transfer 


The age by training by transfer analysis of variance
based upon errors made to the tenth consecutive
correct response or the eightieth trial, whichever
occurred first, is presented in Table 2. This
analysis revealed an age effect (F[2, 54] = 11.6,
p < .001) with younger Ss making more errors in
transfer than older Ss. The magnitude of the relationship
between age and errors in transfer was
greater than in training (E² = .22). The youngest
Ss also had difficulty naming the correct transfer
cue at the end of the experiment, i.e., only three of
the 5 and 6 year old group could state the appropriate
transfer cue while fourteen of the 7 and 8 year old
Ss and all of the adults could verbalize the correct
solution. Furthermore, age and training interacted
(F[4, 54] = 3.2, p < .05). The geometric representation
of this interaction, presented in Figure 1,
shows that while errors decreased for the 7 and 8
year olds from criterion 5 to criterion 10 + 40, they
increased for the 5 and 6 year olds. Despite the lack
of a significant F for an age by training by shift
interaction, it might appear that there is a training by
transfer interaction for the 7 and 8 year old groups.
This F, however, was not significant (F[2, 18] =
“9, p > .05). All the other main effects and
interactions were not significant. In addition, it should
be noted that while all the adults were able to reach
criterion before the eightieth trial, several children
fone not (i.e., seven of the 5 and 6 year olds and 
of the 7 and 8 year olds did not make ten consecutive
correct responses in transfer).


Further Analyses of Errors in Transfer

Because hypotheses were made in advance of running
the experiment, specific least significant difference
(LSD) comparisons were made even when the
F from the transfer analysis of variance was
not significant (11). These LSD values are presented
in Table 3.

TABLE 3


SUMMARY OF LEAST SIGNIFICANT DIFFERENCES
FOR THE TRANSFER CONDITIONS


Hypothesis and Comparison                         Difference    N      LSD

1. ED-change will be learned with fewer
   errors than ED-constant in the
   criterion trained groups
     ED-change (5) - ED-constant (5)                 -134*      12   107.234
     ED-change (10) - ED-constant (10)                +38       12   107.234

2. Overtraining will facilitate
   ED-constant in adults
     ED-constant (10+40) - ED-constant (10),
       18-21 years only                                -4        4    61.707
     ED-constant (10+40) - ED-constant (5),
       18-21 years only                                +8        4    61.707

3. The criterion ED-constant conditions
   will not be significantly different
   in adults
     ED-constant (10) - ED-constant (5),
       18-21 years only                               +12        4    61.707

4. Overtraining will not facilitate
   ED-change
     ED-change (10+40) - ED-change (10)               -33       12   107.234
     ED-change (10+40) - ED-change (5)                +95       12   107.234

5. Overtraining will not facilitate
   ED-constant in children
     ED-constant (10+40) - ED-constant (10),
       7 and 8 years only                             -29        4    61.707
     ED-constant (10+40) - ED-constant (5),
       7 and 8 years only                             -83*       4    61.707
     ED-constant (10+40) - ED-constant (10),
       5 and 6 years only                             +76*       4    61.707
     ED-constant (10+40) - ED-constant (5),
       5 and 6 years only                             ...        4    ...

*p < .05
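
The two LSD values in Table 3 can be approximately recovered from the error mean square of Table 2. The sketch below rests on two assumptions that the text does not state: that the tabled differences are differences between group error totals (so the standard error is the square root of 2 x n x MSE), and that the critical value is the two-tailed .05 value of t for 54 df, taken here as roughly 2.005.

```python
import math

MSE_TRANSFER = 118.53  # "Ss/groups" mean square from Table 2
T_CRITICAL = 2.005     # assumed two-tailed .05 value of t for 54 df


def lsd_for_totals(n_per_group, mse=MSE_TRANSFER, t=T_CRITICAL):
    """Least significant difference between two group totals of n scores each."""
    return t * math.sqrt(2 * n_per_group * mse)


if __name__ == "__main__":
    for n in (12, 4):
        print(n, round(lsd_for_totals(n), 2))
```

Under these assumptions the values come out close to, but not exactly equal to, the tabled 107.234 (N = 12) and 61.707 (N = 4), so the author's exact procedure may have differed slightly.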


ED-constant, ED-change, and Criterion Training.
It was found, as hypothesized, that for the criterion
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5 conditions, fewer errors were made in the ED- 
change than ED-constant task for all Ss (p< .05). 
This was not the finding, however, for the criterion 
10 conditions. While the data showed that more er- 
rors were committed in ED-change (10) than in ED- 
constant (10), this difference was not reliable. 


ED-constant, ED-change, and Overtraining. 
Overtraining did not facilitate the ED-constant task 
in adults. That is, when the ED-constant (10+40) 
condition was compared to both the ED-constant (10) 
and the ED-constant (5) conditions, for the adult
group, no significant differences were found. Furthermore,
ED-constant (10) was not significantly
different from ED-constant (5). In short, it ap- 
pears that the amount of original training had no ap- 
preciable effect on the number of errors made in the 
ED-constant task for the adult group. As hypothe- 
sized, overtraining did not significantly facilitate or 
inhibit ED-change performance for all the different 
age groups combined. 


ED-constant, ED-change and Overtraining in
Children. It was hypothesized that overtraining would
not facilitate the ED-constant task in children. It
was found, however, that 7 and 8 year olds in the
ED-constant overtraining condition made fewer errors
than in both of the other ED-constant tasks, with
the difference between the ED-constant (5) and
ED-constant (10+40) conditions proving reliable (p <
.05). The opposite was found for the 5 and 6 year
olds, with more errors being made in the ED-constant
overtraining transfer condition than the other two


FIGURE 1. 
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LEVELS OF TRAINING 


transfer groups combined (p < .05 for both). Interestingly,
when both the ED-constant and ED-change
transfer conditions were combined for the 5 and 6
year olds, a positive linear trend was found in the
number of errors made from the lowest to the highest
training level. The nonlinear portion of this
trend was not significant.


DISCUSSION 


While it was found that ED-change was learned
with fewer errors than ED-constant shift for the
criterion 5 conditions and that overtraining neither
facilitated nor inhibited ED-change learning, the
remaining major predictions made for this study were
not confirmed.


Degree of Training and ED-Change


It was predicted that changing the cues of the previously
relevant training dimension in transfer would
facilitate learning in the criterion trained groups in
comparison to no change at all. Furthermore, it
was predicted that this facilitation from changing the
cue would mask any facilitation from overtraining;
that is, the switch from training to transfer would
be equally obvious for both the criterion and overtrained
Ss given the ED-change task. It was found,
however, that only with the lowest training level
(criterion 5) was ED-change easier than ED-constant.
With the criterion 10 conditions, ED-constant
was easier than ED-change although this
difference was not reliable. The difference between
the criterion 10 conditions is mostly within the 5 and
6 year old age group. The 7 and 8 year old Ss performed
in the predicted direction, showing fewer errors
in ED-change than in ED-constant criterion 10.
It is not clear why 5 and 6 year olds emitted more
errors in the ED-change task than in the ED-constant
task under the criterion 10 condition.


The effects of overtraining on ED-change were
consistent with the prediction. Other experiments
have shown that a high degree of original training
either does not facilitate or mildly interferes with
the ED-change task in children (18). This was found
for both groups of children in the present study when
the ED-change overtraining condition was compared
to the lowest training criterion (5); however, only
the 7 and 8 year old Ss found ED-change overtraining
more difficult than the ED-change criterion 10
condition. It is interesting that the performance of
these Ss in ED-change and ED-constant is noticeably
different. Increasing the training trials for 7 and 8
year olds seems to facilitate ED-constant but inhibit
ED-change performance. While the performance of
5 and 6 year old Ss under ED-change was different
from their performance under ED-constant, in геї 
of these transfer conditions overtraining was pgs 
than the lowest training level. 


For the adult group the overtraining ED-change
task was the easiest discrimination, although not
much different from the criterion 5 condition. Both
the form and magnitude of the curves for ED-change
and ED-constant, which depict errors in transfer at
the different training levels, were similar, indicating
that, for these Ss, cue change was not a crucial
variable.


While no definite statement can be offered
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explaini 

these results with children, it does seem
that changing the cues along the previously relevant
training dimension in transfer affects the performance
of these Ss.


Degree of Training and ED-Constant Transfer in 
Adults 


Contrary to the third prediction, overtraining did 
not facilitate ED-constant transfer in adults. Evi- 
dently, a large amount of original training did not 
significantly affect these Ss in ED-constant shift. 
Perhaps the ED-constant discrimination was suffi- 
ciently easy for adult Ss, so that the facilitating ef- 
fects of overtraining would not be readily apparent. 
Low levels of training, in this case, would be enough 
to ensure rapid solution in transfer. In short, each 
of the training levels was stringent enough to alert 
Ss that errors, in the initial phase of the shift, were 
indicative of a change in procedure rather than mis- 
takes of the E or themselves. At the very least, it 
can be stated that large amounts of training did not 
increase the ease of discrimination of a change in 
procedure for these Ss. In addition, all of the adult 
Ss were able to verbalize the correct stimuli for train- 
ing and transfer. Therefore, it seems reasonable 
to conclude that the adults trained to criteria of five 
and ten consecutive correct responses were highly 
trained, and adding the extra training trials did not 
enhance the learning of the ED-constant task. 


The ease of the training and transfer discriminations
may be related to the S's ability to label the
stimulus cues and S-R contingencies. Kendler and
Kendler (9) suggest that training in a discrimination
situation establishes mediating responses to the
dimensional aspect of the physical stimulus. The
addition of response-produced cues facilitates
instrumental response learning. For humans, these
implicit response-produced cues may be verbal labels.
In a simple 2-dimensional discrimination,
adults can label the stimulus and S-R contingencies
after only a few correct trials. Perhaps well developed
labeling abilities enhance the discrimination
of change in reinforcement contingencies. This
discrimination of change may function to reduce
perseverative errors to previously relevant training cues
and/or the previously relevant training dimension,
leading to a more rapid transfer solution. In other
words, by more quickly recognizing incorrect responses
as indicants of a change in task rather than
haphazard mistakes of the E or themselves, Ss will
learn the ED-constant task with fewer errors. Since
the present experiment did not allow an adequate test
of perseverative responses in transfer, the hypothesized
relationship between discrimination of change
and perseverative errors is not derivable from the
results of this study.

Degree of Training and ED-Constant Transfer in
Children

For the 7 and 8 year old Ss, overtraining was
found to have a facilitating effect on ED-constant
learning. As the degree of original training in- 
de in ED-constant transfer 
““Creased with the ED-constant overtraining condi- 
ither of the criterion trained 


By employing the discriminable change hypothesis,
the present results found for the 7 and 8 year old Ss
in the ED-constant task can be explained. Simply
stated, the effects of large amounts of original training
might have been to enhance the clarity of the
switch in reinforcement contingencies, leading to a
more rapid solution of the ED-constant task. If the
training and ED-constant discriminations were at an
optimal level of difficulty for these Ss, such that they
could attach labels to the stimuli and the correct S-R
contingencies, then additional training trials (by
enhancing the discrimination of change between the two
tasks) would have a facilitative effect. Indeed, as
the training criteria were raised, these Ss made
progressively fewer errors in ED-constant transfer.
Another somewhat related explanation is the 2-factor
theory of discrimination learning (7, 8, 14,
15, 16). According to this approach, discrimination
learning involves S-R connections in the early stages
and the acquisition of general discrimination sets in
the later stages. Discrimination sets, in contrast
to learning sets, can be acquired in one task. The
specific or single unit S-R connections lead to negative
effects in ED-transfer, while the discrimination
sets have positive effects.


According to Iwahara and Sugimura (8), the acquisition
of discrimination sets is correlated with
intelligence of the S and the degree of original training.
With low training criteria, the discrimination set may
not be acquired. The fact that a shift may become
more difficult with an increase in training up to a certain
point is accounted for by the negative transfer
effects of the specific S-R connections acquired in
training. Conversely, the decreasing tendency in errors
during shift with an increase in training trials
beyond that certain point is due to the facilitative
effects of discrimination sets also acquired in training.
While these authors have not explained how a
discrimination set facilitates ED-constant learning,
or for that matter what constitutes a discrimination
set, it may well be a labeling process. That is, as
the training trials are increased, Ss may become
progressively better at naming the stimuli and correct
S-R contingencies resulting in a better discrim-

hen transfer occurs. Thus, a dis- 
be what Kendler and Kendler 
(9) call verbal labels. According to these authors, 
verbal lal d be fairly well devel- 
oped in hu To repeat, ef- 


fective labeling ma 
ination of change from trainin, 


labeling process is by ac 1 
trials and may lead to a reduction in perseverative
errors of both kinds (e.g., previously relevant cue
and dimension). However, if the discrimination employed
becomes too difficult and Ss are unable to
effectively label, or the Ss are deficient in labeling
abilities, then additional training trials may have
adverse effects.
In contrast to the results found for the older children,
5 and 6 year old Ss made significantly more
errors in ED-constant transfer when overtrained
than when required to emit either five or ten consecutive
correct responses during training. The difference
between the two criterion trained ED-constant
conditions was slight. The fact that overtraining
inhibited ED-constant transfer in 5 and 6 year olds and
facilitated it for 7 and 8 year olds may also be related
to the ability to label. From the interviews

52 THE JOURNAL OF EXPERIMENTAL EDUCATION 


given at the end of the experiment, it was found that
only about 10 percent of the 5 and 6 year old Ss were
able to name the correct training and transfer stimuli,
while almost 65 percent of the 7 and 8 year old
group could perform this task. Kendler and Kendler
(10) have suggested that children below the age of 6
are deficient in the ability to form verbal mediating
responses. If being able to name the stimuli and
contingencies was difficult for these Ss, then overtraining
would not be expected to facilitate ED-constant
learning. Furthermore, if, in fact, these
Ss learned the problem in a rote manner (e.g., based
upon single-unit S-R principles) then overtraining
might, as suggested, retard ED-constant learning.
Iwahara and Sugimura (8) found that overtraining 
feebleminded adolescents resulted in a retardation 
of ED-constant shift learning as compared to low 
levels of training. The ED-constant shift was facil- 
itated after overtraining for the more intelligent Ss. 
The authors concluded that the feebleminded Ss 
learned the problem more mechanistically than con- 
ceptually; partial reinforcement of the previously 
relevant training cue in ED-constant transfer might 
enhance perseverative errors to a greater extent, if 
the S learned the training problem in a rote manner. 
As suggested previously, learning a discrimination 
problem mechanistically may involve the absence of, 
or a reduction in, the effectiveness of being able to 
label the stimuli and S-R contingencies. This would 
function to reduce the discrimination of change and 
enhance the negative effects in ED-constant transfer 
found after large amounts of training. That is, with 
poor discrimination of change resulting from impov- 
erished labeling abilities perhaps many perseverative 
errors would be committed in ED-constant transfer. 
In this situation, possibly because of the greater num- 
ber of extinction trials needed and the exacerbating 
effects of partial reinforcement of previously rele- 
vant training responses, more errors would be made 
after overtraining than criterion training. The fee- 
bleminded Ss of Iwahara and Sugimura's (8) study as 
well as the 5 and 6 year old Ss in the present experi- 
ment may have both been unable to effectively label 
the stimuli and thus performed more poorly in ED-constant
after overtraining. In sum, it has been suggested
that a large amount of original training would
facilitate ED-constant learning if Ss had fairly well 
developed labeling abilities and inhibit this transfer 
task if Ss were deficient in this ability. Further- 
more, if the training and transfer tasks were easy 
enough for sufficient labeling to take place after low 
levels of training, then the facilitating effects of over- 
training in ED-constant transfer would be negligible. 


CONCLUSIONS 


While most of the main predictions of this study 
were not confirmed, several conclusions are war- 
rented: (a) age of the S appears to be a crucial 
variable affecting shift; (b) for adults, degree of 
training will not be a crucial variable in ED-constant 
learning if the transfer discriminations are too sim- 
ple; (c) degree of training may have different effects 
in older, in contrast to younger, children given the 
ED-constant task. It was suggested that these dif- 
ferences may be related to the ability to label the 
stimuli and S-R contingencies; and (d) overtraining 
did not significantly facilitate or inhibit ED-change 
performance. However, it seems that under at least
some training levels, ED-change and ED-constant 
have different effects as a function of age. 


An investigation of the effects of labeling stimuli 
and S-R contingencies in training and ED-constant 
transfer after different amounts of training would be 
useful. In this context, it would also be important to
investigate the relationship of labeling, discrimina- 
tion of change between training and transfer, and the 
commission of perseverative errors to the previous- 
ly relevant training cue and dimension. 


It seems reasonable that labeling and utilizing the
necessary information in a discrimination problem
is, in part, related to the difficulty of the problem
(e.g., number of dimensions, kinds of stimuli used,
and type of shift). It also appears reasonable that
partial training for 5 and 6 year old Ss may be sufficient
for 7 and 8 year olds to learn the task, and may
be overtraining for adults, if the same discrimination
were employed. Equating for task complexity across
the different age groups might result in similar
performance under each training level for easy tasks,
where labeling is not difficult, and gross differences
appear with the more complex discriminations. However,
to delineate the relationship between task
complexity and the ability to label, as it influences the
discrimination of change between training and
ED-constant transfer, requires further research.


FOOTNOTES 


1. The preparation of this research was partially
supported by National Research Council of
Canada 311-1665-12. The study is based on a
doctoral dissertation submitted to the Graduate
School at the University of Utah. The advice
and encouragement of David Dodd, dissertation
adviser, is gratefully acknowledged.
2. Requests for reprints should be sent to Michael
D. LeBow, who is now at the University of
Manitoba, Department of Psychology, Winnipeg,
Canada.
EMPIRICAL EVIDENCE ON THE
APPLICATION OF LORD'S SAMPLING TECHNIQUE
TO LIKERT ITEMS
ABSTRACT


This study was conducted to evaluate empirically if means and standard deviations could be estimated
accurately by Lord's technique for an attitude scale which consisted of Likert items. The procedures followed
were those described by Lord (2). The formulas used were those reported by Plumlee (3), except that a minor
modification was made in the formula for estimating the population standard deviation. An estimate of the
population mean using item sampling procedures based on 5 percent item samples (3 items in each sample) was
less deviate from the population mean than fifteen of the twenty examinee sample estimates (5% examinee
sample). A 10 percent item sample estimate of the population mean was less in error than nine of the ten
corresponding examinee sample estimates and a 20 percent item sample estimate was more accurate than all
five of the estimates from the 20 percent examinee samples. A 5 percent item sample estimate of the
population standard deviation was more accurate than twelve of the twenty examinee sample estimates. A 10
percent item sample estimate was in less error than all ten of the examinee sample estimates and an estimate
from the 20 percent item sample was more accurate than four of the five examinee sample estimates. Usable
estimates of the population mean and [...]


LORD'S (2) technique for the development
of test norms by using item sampling procedures has
been empirically checked for tests that are scored
by number right. Lord (2), Plumlee (3), and Cook
and Stufflebeam (1), as well as others, have shown
evidence that item sampling is as effective as examinee
sampling, if not more so, in test norming. The
advantage of item sampling, that each student spends
only a few minutes answering a few items instead of
many minutes answering an entire test, has appeal in
school districts where new instruments are being
used or developed.


This study was conducted to evaluate empirically
if means and standard deviations could be estimated
accurately by Lord's technique using an attitude scale
which consisted of Likert items. Each item was
scored on a 4-point continuum: 4 points for strongly
agree, 3 points for agree, 2 points for disagree, and
1 point for strongly disagree. The need for the study
came from the use of an attitude scale in a school
district to evaluate an exemplary program. The item
sampling technique was more feasible than traditional
approaches in estimating the population mean and
standard deviation, but no previous studies could be
found except ones in which the tests were scored by
number right.




METHODS AND DATA SOURCES 


The procedures followed were those described by
Lord (2) and the formulas used were those reported
by Plumlee (3), except that a modification was made
in the formula for estimating the population standard
deviation in order to adapt it to Likert items. An
available Likert type attitude scale in the area of
student satisfaction with school was utilized. This scale
consisted of sixty items and was given to a population
of six hundred fifth and sixth grade students. The
scale was not timed and approximately 1 hour was
allotted for students to respond to the items on the
scale.


Item sample and examinee sample estimates of
the population mean and standard deviation were made
from item and examinee samples of 5, 10, and 20
percent. The examinee samples were drawn at random
without replacement and consisted of twenty
random samples (30 examinees each) for the 5 percent
samples, ten random samples (60 examinees
each) for the 10 percent samples, and five random
samples (120 examinees each) for the 20 percent
samples. In a similar manner, the item samples
were formed by random sampling without replacement
and each item sample consisted of three items
for the 5 percent samples, six items for the 10
percent samples, and twelve items for the 20 percent
samples.

Means, standard deviations, and item variances
were computed for each examinee and item sample.
The means, standard deviations, and item variances
for the item samples were used to compute an estimate
of the population mean and an estimate of the
population standard deviation. The formulas presented
by Plumlee (3) were used except that the
item variance sum term (Σpq) of the standard deviation
formula was converted to its algebraic equivalent
for Likert items scored on a scale from 1 to 4,
Σx²/(m − 1), in which m stands for the number of
examinees taking a subtest.

These data were obtained from students attending
elementary school adjacent to the Bloomington campus
of Indiana University, a community consisting
of an above average number of professionally oriented
families.


RESULTS AND CONCLUSIONS


The estimate of the population mean using the item
sampling procedure based on the 5 percent item samples
(3 items in each sample) was less deviate from
the population mean than fifteen of the twenty examinee
sample estimates (30 examinees in each sample).
The 10 percent item sample estimate of the
population mean was less in error than nine of the
ten examinee sample estimates. The 20 percent
item sample estimate of the population mean was
more accurate than all five of the estimates from the
20 percent examinee samples.

The relative accuracy of the estimate of the population
standard deviation from the item samples
was similar to the estimates of the population mean
when examinee sample estimates were used for comparison.
For the 5 percent sample estimate, the
item sample estimate of the population standard deviation
was more accurate than thirteen of the twenty
examinee sample estimates. The 10 percent item
sample estimate was in less error than all ten of the
examinee sample estimates, and the estimate from
the 20 percent item sample was more accurate than
four of the five examinee sample estimates.


EDUCATIONAL IMPORTANCE OF THE STUDY


For the data obtained from this 60-item Likert
type attitude scale, the estimation of the mean by
item sampling was [...] not deviating more than 1.3 raw
score points from a population mean of 175.45 (see
Table 1). Compared to examinee sample estimates,
the item sample estimates seemed to have an advantage.
The item sample estimate of the population standard
deviation of 25.47 was in error 1.53 raw score points
for the 5 percent item sample estimate, 0.08 points
for the 10 percent estimate, and 0.55 points for the
20 percent estimate. Based on these data, the standard
deviation was fairly well estimated; however,
no information was obtained on the shape of the
distributions and estimates of norm tables were not
made.

If it is more feasible in some schools for each
student to spend only a few minutes responding to a
few items instead of a much [...] time [...],
then item sampling procedures offer [...]
estimates of the population mean
and standard deviation seem to be obtained. However,
further verification of the procedure for Likert
type attitude scales needs to be made as this study
was completed on a single, 60-item scale in which
each item was scored on a range from 1 to 4 points.


TABLE 1

MEANS AND STANDARD DEVIATIONS ESTIMATED BY ITEM-SAMPLING
AND EXAMINEE-SAMPLING METHODS ON A 60-ITEM LIKERT-TYPE
ATTITUDE SCALE

Method and Sample               N            Mean          Standard
                                                           Deviation

Population                      600          175.45        25.47

5 Percent Examinee
Sample Estimates
  A                             30           169.77        31.21
  B                             30           178.60        22.62
  C                             30           180.30        22.40
  D                             30           172.83        30.79
  E                             30           176.27        23.76
  F                             30           170.80        28.39
  G                             30           174.50        22.94
  H                             30           171.80        27.18
  I                             30           176.80        25.25
  J                             30           174.70        27.50
  K                             30           177.10        24.83
  L                             30           176.50        19.14
  M                             30           179.30        25.53
  N                             30           176.50        24.15
  O                             30           178.93        22.70
  P                             30           178.63        20.25
  Q                             30           168.03        24.77
  R                             30           169.37        27.74
  S                             30           179.10        23.97
  T                             30           179.17        24.64

5 Percent Item
Sample Estimate
(3 items each)
  A-T                           30/sample    174.17        23.94

10 Percent Examinee
Sample Estimates
  A                             60           176.08        25.38
  B                             60           172.95        28.29
  C                             60           177.85        21.70
  D                             60           180.17        24.1[illegible]
  E                             60           [illegible]   [illegible]
  F                             60           [illegible]   24.20
  G                             60           [illegible]   [illegible]
  H                             60           170.93        23.80
  I                             60           [illegible]   28.41
  J                             60           178.42        24.95

10 Percent Item
Sample Estimate
(6 items each)
  A-J                           60/sample    174.42        25.39

20 Percent Examinee
Sample Estimates
  A                             120          171.58        34.01
  B                             120          [illegible]   [illegible]
  C                             120          174.39        29.04
  D                             120          175.91        26.13
  E                             120          176.16        25.98

20 Percent Item
Sample Estimate
(12 items each)
  A-E                           120/sample   175.48        26.02


REFERENCES

1. Cook, Desmond L.; Stufflebeam, Daniel L., "Estimating
   Test Norms From Variable Size Item
   and Examinee Samples," Educational and Psychological
   Measurement, 27:601-610, 1967.

2. Lord, Frederic M., "Estimating Norms by Item
   Sampling," Educational and Psychological Measurement,
   22:259-267, 1962.
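
The comparisons reported under RESULTS AND CONCLUSIONS can be re-derived from the 5 percent rows of Table 1. The sketch below is an illustration only, not the authors' procedure: it assumes "more accurate" means a smaller absolute deviation from the population value, and the names (`count_beaten`, `EXAMINEE_5`, `ITEM_5`) are introduced here for the example.

```python
# Population values and 5 percent sample estimates, copied from Table 1.
POP_MEAN, POP_SD = 175.45, 25.47

# Examinee samples A-T as (mean, standard deviation) pairs.
EXAMINEE_5 = [
    (169.77, 31.21), (178.60, 22.62), (180.30, 22.40), (172.83, 30.79),
    (176.27, 23.76), (170.80, 28.39), (174.50, 22.94), (171.80, 27.18),
    (176.80, 25.25), (174.70, 27.50), (177.10, 24.83), (176.50, 19.14),
    (179.30, 25.53), (176.50, 24.15), (178.93, 22.70), (178.63, 20.25),
    (168.03, 24.77), (169.37, 27.74), (179.10, 23.97), (179.17, 24.64),
]

# Single item-sample estimate (3 items per sample): (mean, standard deviation).
ITEM_5 = (174.17, 23.94)


def count_beaten(item_estimate, examinee_estimates, target):
    """Count examinee estimates whose absolute error exceeds the item estimate's.

    Assumption for this sketch: accuracy is judged by |estimate - target|.
    """
    item_error = abs(item_estimate - target)
    return sum(1 for e in examinee_estimates if abs(e - target) > item_error)


means_beaten = count_beaten(ITEM_5[0], [m for m, _ in EXAMINEE_5], POP_MEAN)
sds_beaten = count_beaten(ITEM_5[1], [s for _, s in EXAMINEE_5], POP_SD)

print(means_beaten, sds_beaten)  # 15 13
```

Under that assumption the counts come out to fifteen of twenty for the mean and thirteen of twenty for the standard deviation, the figures given in the RESULTS AND CONCLUSIONS text.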
BOOK REVIEWS

Robert E. Clasen
book review editor


NEW DIMENSIONS IN HIGHER EDUCATION

Collier, K.G. (New York: Humanities Press, Inc., 1968), 164 pp.

THE AUTHOR has had most of his teaching experience in departments and schools of education in apprentice
colleges. He has taught courses in education for a 6-week term of summer school at New York University.
He is now and has been for a number of years an instructor and professor at the Venerable Bede College of the
University of Durham and has, in recent years, been principal of that college. Naturally, the book is oriented
toward higher education in Great Britain, but the author is apparently very well informed about higher education
in the United States.


There are two new dimensions discussed in the book. The first is the greater scale on which full-time
higher education is likely to be conducted within the next generation. By the end of the century, perhaps one in
three, or even one in two, in the immediate post-high-school age group will be participating in higher education.
This compares with one in twelve in 1962. The second new dimension is that opened up by educational technology,
with the sharper awareness it is forcing on us of what higher education is and what it might be.

The discussions are obviously for liberal arts colleges in higher education rather than other types of higher
education, and the author commits himself to the fundamental premise that a liberal arts education should be
general education rather than specialized education. In Collier's opinion, the type of subjects which may well
be offered for general education should be governed by five principles:


1. Contemporary civilization is inescapably dependent on science and technology and the modern
citizen needs to appreciate the positive use both of scientific methods and of technological development
in the normal life of the country.


2. Modern society is changing more rapidly than any previous human society. World population
is increasing at phenomenal speed; communities of all kinds, whether urban or rural, industrial or
underdeveloped, are growing in scale and changing in style in unforeseeable ways.


3. We live in a society where there is no longer any clear consensus among leading citizens as
to the values that should rule our lives. The older generation find it difficult to help young adults
to form their values. The speed of social change makes adaptiveness and flexibility more important
than in the past.


4. There must be some continuing study of language and its assumptions, and practice in its use.


5. There must be provision for synthesis, since students will be following not only a specialist
course, but non-specialist courses on "science and society," "social change," and "values." This
is perhaps best done through syndicate (small group) assignments and seminars conducted by tutors
who are familiar with the general courses and are concerned to attempt some synthesis.


Throughout the book, the author discusses various types of teaching methods which satisfy four basic
conditions for good teaching: (1) Intense activity of the mind of the learner in a continual process of
comparison [...]
PROGRAMMED TUTORING OF DECODING SKILLS
WITH THIRD AND FIFTH GRADE NON-READERS


ELLIS RICHARDSON and LUCY COLLIER
ABSTRACT


The acquisition of decoding skills (sound-symbol correspondence, visual analysis, and blending) was
studied with twelve Ss who scored below average on a battery of psychomotor tests. A group of twelve
"no treatment" control Ss were shown to be superior to the experimental Ss in reading simple sight words
on a laboratory pretest. Each experimental S required an average of 4 1/2 hours of tutorial time,
distributed across forty-three sessions, in learning the program content. Posttest results showed the
experimental Ss to be superior to the controls on all measures of decoding and demonstrated that
experimental Ss could apply decoding skills to unfamiliar content. The major conclusion drawn is that
so-called dyslexic children can learn basic reading skills. The success was attributed to the
highly-structured, programmed approach.
ALTHOUGH MUCH has been published regarding
the nature and cause of reading disabilities ([...] 13),
little is known about effective approaches to the
correction of these problems. Studies implicate
emotional maladjustment (11:74), cultural [...] ([...] 18),
and neurological impairment (13:57-71). Remedial
approaches range from physical therapy (4),
multi-sensory stimulation (VAKT and Fernald tracing
techniques) (15:43), and specific sensory modality
training in deficit perceptual areas (15:43-44), to
varied techniques of teaching reading itself.
Progress has been made in identifying [...] at the
readiness stage (5), yet no effective program has been
documented which provides the remedial help for
children who need it early in school. In 1968,
one-third of New York City's public school children
between the second and ninth grades were reading one
year behind the national norm; and one-fourth were
reading 2 years behind the national norm, as measured
by the Metropolitan Reading Achievement Test
(MRAT) (12). But the standardized score [...]
children [...], and [...] those children [...]
responsive to remedial intervention [...] no help is
available within the educational in[...].


It is clear from a recent survey of reading studies
(2:103, 107) that a "code emphasis" method, one that
emphasizes consistent training in decoding
printed language for spoken language, in contrast to
a "meaning emphasis" method, produces better results
at the beginning reading stage.


The decoding skills outlined by recent investigators
(7, 10, 14) provide the behaviors which allow
a child to determine the word-sound corresponding
to a word-image through a knowledge of its sound
[...]. The usual look-say skill requires that
a child learn a separate sound association for each
word configuration. The decoding approach requires
that the reader first analyze a word into its parts on
a visual basis (e.g., for the trigram man, one of
three divisions are possible: m, a, n; m, an; ma, n),
sounding out the word components in left-to-right
order with the knowledge of single letter and bigram
sound associations. Then, he must be able to derive
the oral blend for the whole word from the component
parts. The analytical reader then has a tool
for decoding new letter combinations made up of
familiar [...]. Research has indicated that the average
child cannot induce letter-sound and blend-sound
relationships from look-say training without explicit
training in decoding skills (14:30).


Chall, Roswell, and Blumenthal (3) have shown
that auditory blending ability is positively correlated
with reading achievement on several different
measures (silent reading, oral reading, and phonic
ability). Silberman (14) has explored this relationship.
He investigated techniques for teaching analytical
decoding skills with consonant-vowel-consonant
(CVC) trigrams and found that the key to training the
generalized skill for reading novel trigrams was
auditory blending practice. First grade children who
were not trained to respond with a whole word-sound
(e.g., man) to the auditory presentation of the
phoneticized elements (e.g., /m/, /an/) were not able to
apply their analytical skills to novel combinations of
trigram elements. His Ss who had been trained in
the auditory blend were able to decode 75 percent of
the novel trigrams tested, while Ss who did not receive
auditory training in an earlier version of the
program were unable to read any of the novel trigrams.


Gotkin, McSweeney, and Richardson (7) have
recently developed a similar program for kindergarten
children. They found that kindergarten children
could generalize to one-third of the seven novel
trigrams tested after completion of the program. They
found it necessary to place even heavier emphasis on
the auditory blending training than did Silberman (14).
The Gotkin study generated evidence that further
training in the programmed routines would increase
the percentage of novel words a child could read.


Silberman and Gotkin demonstrated that decoding
skills could be taught to first grade and kindergarten
populations with automated individualized lessons.
The project described here represents an
attempt to study in detail the acquisition of essential
decoding skills by third and fifth grade nonreaders
who were not benefiting adequately from remedial
help available in the school. The program extended
and revised the Gotkin work and utilized a programmed
tutorial technique similar in many aspects to
techniques developed by Ellson and others (6).


DESCRIPTION OF THE PROGRAM 


The most important behavioral objectives of the 
program may be summarized in the following model 
( 10:28): 


1. Child looks at "pom."

2. Child perceives "p" and "om" as separate
units.

3. Child says sounds in order, "/p/, /om/."

4. Child listens to the sounds he has verbalized.

5. Child produces sound for the whole image,
"pom."


In the first three steps, the child must have the
sound-symbol associations for the images "p" and
"om", he must be able to perceive them as ordered
units of the whole image, "pom," and he must be
able to produce their sounds in order. These steps
are based on the visual modality. In steps 4 and 5,
the child must be able to blend his own verbalization
of two discrete sounds into a single composite sound.
These steps are based on the auditory modality. The
objectives, then, define a word-attack behavior which,
in its generalized form, can be used by the child to
unlock the word-sound of any regular bigram or
trigram. Other behavioral objectives include a small
sight-word vocabulary, capital letters,
labeling punctuation marks, and proper inflection
in oral reading.
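
The five-step model can be sketched in a few lines of code. This is a loose illustration under stated assumptions, not part of the program described: spelling stands in for sound, the split rule (first letter plus final bigram for a trigram) follows the "p" and "om" example, and the names `analyze`, `attack`, and `CYCLE_I_UNITS` are hypothetical.

```python
def analyze(word):
    """Visual analysis: split a bigram into two letters, or a trigram into
    its first letter plus its final bigram (as in "p" + "om")."""
    if len(word) == 2:
        return [word[0], word[1]]
    if len(word) == 3:
        return [word[0], word[1:]]
    raise ValueError("this sketch handles only bigrams and trigrams")


def attack(word, known_units):
    """Word attack: say each part in order, then blend.

    Returns (parts, blend) when every part has a taught sound, else None.
    Assumption: a unit's spelling is used as a stand-in for its sound.
    """
    parts = analyze(word)
    if all(part in known_units for part in parts):
        return parts, "".join(parts)  # blending = joining the sounds in order
    return None


# Units taught by the end of Cycle I according to the text: m, o, p, om, op.
CYCLE_I_UNITS = {"m", "o", "p", "om", "op"}

print(attack("pom", CYCLE_I_UNITS))  # (['p', 'om'], 'pom')
print(attack("mat", CYCLE_I_UNITS))  # None
```

A novel combination such as "pom" is decodable only because both of its parts were taught separately, which is the sense in which the text calls the behavior generalized.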


This program is organized into three cycles or
sets of lessons with seventeen lessons in Cycle I and
ten lessons each in Cycles II and III. Each lesson is
preceded by about 1 minute of sound training in which
the child practices saying the new sounds or words to
be introduced in the lesson. For lessons specially
designed to teach the blending skill, each lesson is
preceded with practice on the auditory blend. In the
auditory blending procedure, the E says the component
sounds (e.g., m, op) and the S responds with the
blended sound (e.g., mop).


Cycle I teaches the objective behaviors for two
bigrams (om and op) and three trigrams (mom, pop,
and mop). Table 1 shows the series of steps used to
attain this objective. Cycle I begins by teaching three
animal sounds to establish the behavior of saying a
sound to a graphic image (Level 1 in Table 1). In the
next few Cycle I lessons, the child is taught to say the
phonic sounds of three letters (m, o, and p) in response
to their graphic images (Level 2). Next the
child is taught the bigrams om and op on a look-say
basis (i.e., no reference is made to the component
letter sounds in Level 3). At Level 4, he is taught to
say the component sounds in order, followed by a
blended response of both sounds together, as in the
previously described model. Level 5 teaches a look-say
response to the three trigrams. At Level 6, the
child is taught to look at the trigrams (mom, pop, and
mop), say the sounds of the first and last two letters,
and follow this with the whole-word response. Note
that at this point the child can produce the model
word-attack behavior in the presence of two particular
bigrams and three particular trigrams, but he is not
yet expected to generalize this behavior to novel
combinations, a task which requires true blending.


Cycle II adds three more single letter-sounds
(s, a, t), two more bigrams (at, ot), and three more
trigrams (sat, pot, pat). Cycle II omits the look-say
responses represented at Levels 3 and 5 in Cycle I,
using a previously established blending behavior to
teach responses to the new combinations. Finally,
Cycle II adds one new skill, word order, to the child's
skill repertory.


Cycle III teaches three new single sounds (n,
i, f) and exploits the blending behaviors to teach five
new bigrams and six new trigrams. Further, Cycle
III extends the skills to include four irregular sight
words, four punctuation marks, appropriate inflection
in oral reading behavior, and capital letters.

After having completed Cycle III, it was expected
that the Ss would be able to respond appropriately
to all of the content presented in the program.
More important than the programmed content, however,
it was expected that the child would be able to
generalize the programmed skills to correctly analyze
and blend novel bigrams and trigrams composed
of familiar letters.


SUBJECTS AND PROCEDURE 

Initially, Ss were selected from a list of twenty-one
children submitted by the school's remedial reading
teacher. All of the children on this list were given
a laboratory test designed to determine knowledge of
any letter sounds, phonic bigrams or trigrams, or
simple sight words. Five Ss were selected on the


TABLE 1

CYCLE  LEVEL  CONTENT                  EXAMPLE
                                       Visual        Auditory Stimulus                     Expected
                                       Stimulus                                            Response

I      1      duck-quack               Cow           Say the sound... quack                quack
              cow-moo                  picture
              dog-woof
       2      m, o, p                  m             Say the sound... /m/                  /m/
       3      om, op                   op            Say the sound... /op/                 /op/
       4      om, op                   om            Say the sound of the first and
                                                     last letters... /o/, /m/              /o/, /m/
                                                     Say the sound of both together
                                                     ... /om/                              /om/
       5      mom, pop, mop            pop           Read the word... pop                  pop
       6      mom, pop, mop            mop           Say the sounds of the first and
                                                     last two letters... /m/, /op/         /m/, /op/
                                                     Now, read the word... mop             mop

II     1      s, a, t                  s             Say the sound... /s/                  /s/
       2      at, ot                   at            Say the sounds of the first and
                                                     last letters... /ae/, /t/             /ae/, /t/
                                                     Say the sound of both together
                                                     ... /at/                              /at/
       3      sat, pot, pat            pot           Say the sounds of the first and
                                                     last two letters... /p/, /ot/         /p/, /ot/
                                                     Now read the word... pot              pot
       4      (word order skill)       mom, sat      Read the words... mom, sat            mom, sat

III    1      n, i, f                  i             Say the sound... /i/                  /i/
       2      in, on, am, an, it       in            Say the sounds of the first and
                                                     last letters... /i/, /n/              /i/, /n/
                                                     Now say the sound of both
                                                     together... /in/                      /in/
       3      fat, mat, man, fan,      fan           Say the sounds of the first
              not, sit                               and last two letters...
                                                     /f/, /an/                             /f/, /an/
                                                     Now read the word... fan              fan
       4      (sight words)            the man       Read the words... the man             the man
              I, is, a, the
       5      (names of punctuation    sit, man,     (E points to comma) What is this
              marks) period, comma,    sit           mark called?... It's a comma.         comma
              question mark
       6      (oral reading skills)    I am a man    1) (S reads words)
                                                     2) (E: "Good, now read it like
                                                        this." E reads words with proper
                                                        inflection.)
                                                     3) S reads words following
                                                        E's model.
       7      (capital letters)        F             Say the sound... /f/                  /f/
              M, O, P, S, A, T,
              N, I, F
basis of near-zero scores on the test. While testing
was in progress, several teachers suggested other
children needing remedial help in reading. Seven
more Ss were selected from testing results making
a total of twelve, seven third-graders and five
fifth-grade hold-over students, accepted into the
experimental program.


After screening, all Ss were tested on the Wep- 
man Auditory Discrimination Test, Birch Perceptu- 
al-Motor Sequencing Test, Birch Audio-Visual Tap- 
ping Patterns Test, and the Bender-Gestalt. 


A control group of twelve was selected from chil- 
dren who received the laboratory screening test but 
were not included in the program because they dem- 
onstrated a limited sight-word reading vocabulary. 


All work with the remedial Ss was conducted in
daily sessions (when attendance and schedules allowed)
lasting from 3-5 minutes in the early parts
of the sequence and from 7-15 minutes in the later
parts. In a daily session, the S was seated in the
laboratory beside the tutor. The tutor read the lesson
or test script, showing the appropriate visual
stimuli (see examples in Table 1) either on 3 inch
by 5 inch index cards or in the pages of a notebook.
In some cases, it was necessary to deviate from the
lesson script to allow for special problems such as
speech difficulties. While sacrificing some of the
control necessary for definitive research, this procedure
allowed the development of techniques for
dealing with these special problems.


Each session was timed [...]
the tutor noted on the S's [...]
used in the session.


If an S made more than two errors on [...]
or had special difficulty in learning a [...],
the tutor repeated the lesson that taught the weak
element. If the S did not master the item with this
additional practice, a reinforcing activity in the form
of a reading game or drill was given in the next session
or in several subsequent sessions.


At critical points in the sequence (e.g., at the
point between the single-sounds lessons and the
bigrams lessons), pre- and posttests were given to
determine mastery of previous lessons and to assess
knowledge of the next few lessons. These results were
used to recycle Ss in lessons not sufficiently mastered
and to skip over lessons for which the S already
knew the content. This procedure insured 100 percent
mastery of previously covered materials and
also avoided unnecessary time spent in lessons which
the S did not need. Each S continued to receive lessons
and tests until 100 percent mastery was attained
on the final Cycle III posttest. Two Ss failed to
complete Cycle III: one due to an inordinate record
of absence from school and the classroom, and the
other due to particularly severe memory deficit and
attention span problems.


At the end of the school year, a specially designed
[...] test was administered to all Ss. The
test was designed to assess mastery and retention of
the programmed content as well as the generalization
of the programmed skills to novel content. The novel
content included both meaningful combinations
(e.g., mat, top) and abstract combinations (e.g.,
fam). All combinations were tested on a look-say
basis where the child was shown the word and simply
instructed to read it, as well as on a blended
basis where the child was instructed to say the
sounds in the word before being asked to read
it. This dimension was included in the test to assess
the degree to which the children would apply
the blending model in decoding the words.


RESULTS AND DISCUSSION 


[...] which involve phonic skills. However, the control
group performed significantly better ([...] .001 level
[...]) on bigrams and trigrams [...]. These results
[...] neither group evidenced phonic [...]; the control
group [...] sight vocabulary which was n[...]
experimental group. [...] this fa[...] the controls
from tre[...] the program.


Table 3 presents the mean [...] achieved by the
remedial Ss on the Bender-Gestalt, Birch
Perceptual-Motor Sequence, and Wepman Auditory
Discrimination Test, compared to the normal
population. The Bender-Gestalt and Birch Sequence
were designed to identify deficits [...] of visual
[...] coordination symptomatic of [...] injury (1, 9).
The Wepman Auditory Discrimination Test identifies
children having [...] similarities and differences in
auditory sequencing (17).


The experimental group fell 3 years below the
average 10-year-old child on the Koppitz scoring [...]
TABLE 2

SCREENING PRETEST RESULTS: PERCENTAGE CORRECT IN PHON[...]

                       Experimental    Control        Mann-Whitney
                       Group %         Group %        U Test

Single Sounds          26.9            29.6           U = [illegible], NS
Meaningful Bigrams     24.2            53.0           U = 25, p = [illegible]
Meaningful Trigrams    14.2            41.5           U = 20, p = .00[illegible]
[row label illegible]  [illegible]     8.3            U = 48, NS
TABLE 3


MEAN AGE DIFFERENCES FROM NORM IN
[...] OF REMEDIAL Ss ON
PSYCHOMOTOR TESTS


                                   Age of      Mean Age     Difference
                                   Norm Group  Level

Bender-Gestalt
  Mean Koppitz Age (N = 12)        10          7            -3

Wepman Auditory
  Discrimination (N = 11)          10          below 5      -5

Birch Perceptual-Motor Sequence
  Analysis Items (N = 11)          10          11.5         +1.5
  Synthesis Items (N = 11)         10          7            -3
  Drawing Items (N = 11)           10          below 5.5    -4.5
  Drawing and Matching
  Items (N = 11)                   10          below 5.5    -4.5


The mean age-level achievement of the experimental
group on the Birch Sequence [...] ranging from -5
years on the drawing and matching items to +1.5
years on the analysis items. One S did not take the
Birch Sequence. The test results indicate that some
visual perceptual and integrative deficits were
present in the experimental group.


The mean age-level achievement on the Wepman
demonstrated more than a 5 year difference. (One
[...] excluded due to invalidation of his [...].)
[...] remedial population al[...] Spanish-speaking
homes, [...] tangle the effects of oral [...]
differences.

Performance of the experimental group is
shown in Table 4, [...] Mean Time per Cycle,
Mean Number of Sessions per Cycle, and Mean Time
per Session. On the average, each child received
about forty-three sessions in a total of about 4 1/2
hours of instructional time during the course of the
experiment. Since [...] taught in Cycles I and II,
[...] children were able to mas[...] a little more
than h[...] Cycle I content [...] in Cycle III).
[...] as much content as the first [...] several new
skills, it is not possible [...] these acquisition
measures to Cycles I and II.


Table 5 indicates mastery [...] as determined
by the criterion tests administered [...] the
program. All Ss achieved 100 percent [...]
Cycles I and II and in single sound and bigram
analysis acquisition in Cycle III. In Cycle III, 94
percent mastery was obtained for bigrams blending and
92 percent for trigram analysis. Capitals, sight-words,
and punctuation mark labeling and function
were mastered at a 92 percent level. All twelve Ss
completed Cycles I and II. However, since two Ss
did not complete Cycle III, their scores reflected
less than 100 percent mastery of Cycle III content.
Their scores, nevertheless, were included in the
data in Table 5 and account for the less than perfect
mastery obtained for Cycle III.


The results of the end-of-the-year test for programmed
content and skill generalization are shown
[...]. These results reflect the mean percentage
of correct responses to different words tested
regar[...]red as a [...] both. The results [...]
groups were compared [...] Test. This comparison
s[...] group performed significantly better than
the control group on all sections of the test.


Of particular importance is the significantly better performance of
the experimental Ss on the "not programmed" content. Recall that the
screening pretest indicated the controls could read significantly
more bigrams and trigrams. However, Table 5 results indicate that
after the experimental Ss had been trained in word-attack skills,
they could read more trigrams than the control Ss. These results
justify the conclusion that the Ss learned generalized word-attack
skills taught in the experimental program.

The degree to which the programmed word-at- 
i tal Ss in decoding the 


look-say respon 
call that all content was 


sponse following 
(blended). Table 7 shows a comparison of the look-say and blended
responses both for the experimental and control groups using a
Wilcoxon matched-pairs test.
TABLE 4

EXPERIMENTAL GROUP PERFORMANCE IN

                               Cycle I      Cycle II     Cycle III
Mean Time in Cycle (Minutes)      92           54           133
Mean Number of Sessions           17           10            16
Mean Time Per Session         [illegible]  [illegible]      8.3
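The figures in Table 4 can be cross-checked against the totals quoted in the text (about forty-three sessions and about 4 1/2 hours of instruction per child). The following is a minimal sketch, assuming that the per-session time is simply the mean cycle time divided by the mean number of sessions; the study may instead have averaged per-child ratios, so the Cycle I and Cycle II values it produces are estimates, not the printed entries.

```python
# Values read from Table 4 (mean minutes and mean sessions per cycle).
mean_minutes = {"I": 92, "II": 54, "III": 133}
mean_sessions = {"I": 17, "II": 10, "III": 16}


def minutes_per_session(minutes, sessions):
    """Ratio of mean time to mean sessions, rounded to one decimal.

    Assumption: ratio of means, which need not equal the mean of
    the individual children's ratios.
    """
    return {cycle: round(minutes[cycle] / sessions[cycle], 1) for cycle in minutes}


total_sessions = sum(mean_sessions.values())      # sessions across all cycles
total_hours = sum(mean_minutes.values()) / 60     # instructional hours
per_session = minutes_per_session(mean_minutes, mean_sessions)

print(total_sessions, round(total_hours, 2), per_session)
```

The Cycle III ratio agrees with the one legible entry in the last row of the table, which lends some support to the ratio-of-means reading.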
TABLE 5

PERCENT OF CONTENT MASTERED BY EXPERIMENTAL GROUP AS DETERMINED ON
PROGRAM CRITERION TESTS

                    Cycle I        Cycle II       Cycle III
                    Pre    Post    Pre    Post    Pre    Post
                    %      %       %      %       %      %
Single Sounds       24.8   100     35.8   100     27.7   100
Bigrams
  Analysis          88     100     92     100     58     100
  Blend             -      100     33     100     46     94
Trigrams
  Analysis          58     100     92     100     92     92
  Blend             17     100     36     100     31     100
Word Order          -      -       78     100     -      -
Capitals            -      -       -      -       15     92
Sight-Words         -      -       -      -       44     92
Punctuation         -      -       -      -       25     92
The experimental group performed slightly better on the look-say
response for the programmed content. However, this difference did
not approach significance. The controls, on the other hand,
performed significantly better (p < .01) on these words when a
simple look-say response was required. These results indicate that
asking these Ss to perform a word analysis, a task they had not been
trained to do, interfered with their ability to read the words.

The effect of the word-attack training becomes apparent when the
"not programmed" content is considered. The experimental group
performed significantly better on both meaningful and nonsense words
(p = .01 and p < .005) when an analysis of the word was required
before the reading response. The


TABLE 6 


PERCENTAGE CORRECT IN POSTTEST 


———————— LLL 


Test Experimental Control Mann-Whitney 
Content % U 
Single Sounds ору аана 
( Programmed) 5,а,%, 90.7 41.6 U=7 
nif P-.001 
Programmed om, at, 
on , an; 93.3 53.3 U-15.5 
it pP-.001 
Bigrams 
ЕР Not Programmed ap, am, 
af, im, 13.3 16.7 0 = 17,5 
ір Р = ,001 
Programmed mom, pop, 
mop, sat, 
pat, pot, 
fat, mat, 91.3 54.9 U-14 
man, fan, Р = .001 
not, sit 
Trigrams 
Not Programmed Be 
Meaningful mat, top, 
leaning! top, fit, 12.2 40.3 0-34 
ріп, sam P-.05 
Nonsense pom, mot, 
вор, pon, 41.2 13.9 U-31 


fam, nit P-.01 | 
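Table 6 compares the two groups with the Mann-Whitney U statistic. As a reminder of how that statistic is obtained, here is a minimal sketch using pair counting (each pair in which the first group's score exceeds the second's counts 1, each tie counts 1/2). The scores below are hypothetical and are not taken from the study; the function name is likewise only illustrative.

```python
def mann_whitney_u(group_a, group_b):
    """Return the smaller of the two Mann-Whitney U values.

    U for group_a is the number of (a, b) pairs with a > b,
    counting ties as one half; U for group_b is n_a * n_b minus that.
    """
    u_a = 0.0
    for a in group_a:
        for b in group_b:
            if a > b:
                u_a += 1.0
            elif a == b:
                u_a += 0.5
    u_b = len(group_a) * len(group_b) - u_a
    return min(u_a, u_b)


# Hypothetical percent-correct scores for illustration only.
experimental = [90, 95, 85, 100]
control = [40, 55, 85, 30]

print(mann_whitney_u(experimental, control))
```

A small U means the two sets of scores barely overlap; significance is then read from tables of U for the given group sizes.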
TABLE 7


PERCENTAGE OF SELECTED TRIGRAMS READ CORRECTLY UTILIZING A LOOK-SAY
APPROACH AND A WORD ANALYSIS AND BLENDING APPROACH


Look-Say Response % Blend: 
led R % i 
Programmed Meaningful к хз 
e Trigra 
E PREIS 88.2 82.6 x жәр 
z А = 8 
© Meaningful 
5 Тіреп gf 4.4 69.4 2d 1 
E = 
5 N=8 
Not In Nonsense 18,1 Жы 
арта . 45.8 Т-і 
N=10 
Р = ‚005 
Programmed Meaningful 
Trigrams 54.2 30.6 » 3 io 
E Meaningful 37.4 23.6 TR. 
8 Trigrams ‘s 
3 NS 
Not In N 
o 
Program Nonsense 5.6 11,1 Test 


i a ee 


difference is greater when the content is nonsensical than when it
is meaningful. The experimental Ss read 15 percent more of the words
correctly with the blended response when the content was meaningful
and about 30 percent more when the content was nonsensical.


CONCLUSIONS 


It should be reemphasized, before stating conclusions, that the
purpose of this experiment was to study in detail the acquisition of
decoding skills in dyslexics. Since the program involved only an
average of 4 1/2 instructional hours per child, and since only a
limited set of content and skills was taught, it was not expected
that this treatment would have much effect on general reading
ability as measured by such tests as the MRAT. The results of this
study, therefore, are in no sense offered as a solution to the
problem of reading failure.


Although the meaning of dyslexia is unclear and varies from study to
study, there is no doubt that we were dealing with dyslexic
children. This conclusion is supported both by their severe reading
retardation and by the evidence of perceptual and integrative
deficits. The experimental results clearly show that dyslexic
children can learn abstract sound-symbol correspondences as well as
sight-words. Furthermore, the study provides conclusive evidence
that, given appropriate training procedures, dyslexic children can
learn to analyze and blend words on a generalized level. No previous
studies have documented procedures which would teach these skills to
children with pervasive neurological involvement.
[Name illegible] and others (3) raised questions regarding the
effect of training on improving auditory blending skills and the
effect such improvements might have on general reading ability. The
results presented here prove that appropriate exercises can improve
auditory blending skills and suggest an effect on general reading
ability through the generalized blending performance of our Ss. The
results are of importance in view of the finding that the auditory
blending deficit is a significant factor in reading disability for
neurologically impaired children (8).


We believe that the fundamental key to our success in teaching
analyzing and blending skills to dyslexic children was the carefully
controlled content and skill sequences and monitoring techniques
representing an extension of several years of research in the area
of decoding (7, 14). The controls necessary to finalize this
conclusion, in which the skill and content sequences are not
systematically arranged and which use no monitoring techniques, are
unfortunately lacking. However, the fact that both experimental and
control Ss had experienced 3-5 years of loosely controlled and
unmonitored reading instruction prior to this experiment lends some
support to this conclusion.


A detailed look at the acquisition records (Sessions, Time, and
Criterion Test Score) provides evidence for two interesting
observations. First, the finding that Cycle II was mastered in about
one-half of the time required for Cycle I lends further support to
the conclusion of Gotkin, McSweeney, and Richardson (8:85) that the
learning-to-learn phenomenon occurs at an early stage in beginning
reading training. Second, the high levels of mastery obtained on the
Criterion Test represent a typical result with programmed
instruction. The usual bell-shaped achievement distribution yields
to a clustering of scores around a high level of achievement. The
high mastery results from the programmed monitoring techniques and
reflects the success of applying these techniques with a dyslexic
population.
The evidence collected here on the generalized
blending skill adds to the evidence already cited CE 
10,14) that children can be effectively trained in ap- 
plying this skill. However, this study represents an 
extension of these earlier results to dyslexics as well 
as suggesting other insights into the nature of this 
skill. It was shown that performing a visual analysis interfered
with the control Ss' ability to read the words tested. The
implications of this result become apparent when one considers that
these Ss demonstrated some minimal phonic skills (e.g., initial
sounds) while the application of these skills actually interfered
with their ability to decode words. The interesting question implied
by this is: How much and what kind of training is necessary to
transform these into useful decoding skills? Another interesting
insight into the blending skill is provided by the evidence that it
is applied differentially to meaningful and nonsense words. On the
one hand, it is clear that the skill led to the successful decoding
of a greater percentage of novel meaningful words than nonsense
words. On the other hand, when the visual analysis was a required
part of the response, there was a greater increase in the percentage
of nonsense words correctly decoded than meaningful words. One
possible interpretation of these results is that children were using
their blending skills when simply asked to read the word (the
"look-say" response) but were more successful in applying it to
words already in their vocabulary (meaningful). However, being
specifically instructed to analyze the words orally (blended
response) significantly increased the number of words they could
decode for both categories. It would follow, then, that there would
be a greater increase for the nonsense words since several of the
meaningful words had already been successfully blended when the
look-say response was required.


These data raise questions of how decoding skills are actually used
in reading. Do they become highly integrated processes that occur at
a high rate of speed? Do they merely serve to help the reader decode
an unfamiliar word until that word becomes integrated into his sight
vocabulary? These data suggest different ability levels in the
application of the blending skill. When does this skill emerge? To
what degree of utility can this skill be attained? These questions,
and a host of other relevant questions, merely reflect the primitive
state of our knowledge of the process of decoding.


Finally, although the work reported here is limited, the initial
success with this program would indicate that a highly-structured,
individually tutored, programmed approach may be a promising avenue
of investigation leading to real solutions to the dyslexic problem.
Such solutions would take several more years of developmental work
extending these techniques to other decoding skills and to
comprehensive skills.
ABSTRACT


Students in one school system in grades 5 through 11 (522 boys, 548
girls) responded to an objective examination which incorporated a
measure of risk taking. The study was replicated in a second school
system (600 boys, 691 girls). In each case the proportion of
risk-taking variance associated with variation in grade level was
approximately .10 (significant at the .05 level), with higher risk
in grades 5, 6, and 7 than in grades 8, 9, 10, and 11. Boys took
greater risks than girls in both school systems, but the proportion
of risk-taking variance explained by sex was low (approximately .01)
and significant (at the .05 level) in only one school system. There
was no interaction between grade level and sex.


Although risk taking is becoming an increasingly interesting
variable in educational and psychological research, comparatively
little is known about its relation to age or sex. For example,
Wallach and Kogan (13)
inating ) found that older Ss (mean age approxi- 
dinis 0) were more conservative than college stu- 
In Cadre a hypothetical choice-dilemmas instrument. 
Кошаев of 6- to 10-year-old children, Kass (4) 
à slot en a difference in gambling with pennies in 
risk-taking ine. However, Cohen (2: chapter 5) ina 
Са ШЕ eltnation with candy:for pelses, found that 
-old children took greater risks than 12-year- 


olds А 
olds’ Who in turn took greater risks than 15-year- 


With respect to sex, Wallach and Kogan (12, 13) found no consistent
sex differences in risk, and little evidence of feminine
conservativeness. Kass (4), however, found boys selected greater
risks than girls with the slot machines. On the other hand, using a
decision-making task with candy as the prize, Slovic (10) found a
sex-by-age interaction; i.e., no sex difference in younger children
(ages 6-10), but with older children (ages 11-18) greater risk was
manifested by boys.


Risk taking on objective examinations (RTOOE) is defined as guessing
when the examinee is aware that there is a penalty for incorrect
responses (6).


ез 5 through 11( 522 boys, 548 girls) responded to an objective exam- 
The study was replicated in a second school system (600boys,691 
ariance associated with variation in grade level was approx- 


d 7 than in grades 8,9, 10, and 11, 


1) in only one school system. There was no 


A previous study (7) noted (a) the potential usefulness of RTOOE to
psychologists as a disguised measure of risk taking, and (b) the
effect of RTOOE on test score, which merits the attention of
individuals concerned with educational measurement. Specifically,
with RTOOE, no sex differences were found in college students
(6, 8, 9). However, eighth-grade females were found to be greater
risk takers than eighth-grade males (7), while the opposite was
found for ninth-grade students (11). The purpose of the present
study, therefore, was to (a) devise measures of RTOOE that would be
appropriate for use in grades 5 through 11, and (b) administer the
measures to suitable grades in order to observe the relation of
RTOOE with grade level or sex.


METHOD 
The measures of RTOOE were based upon the use of nonsense items,
where a nonsense item is defined as one that has no correct (or
best) answer, and no incorrect answer for the given population.
Previous research (6, 7, 8, 9) has suggested that five nonsense
items embedded in five legitimate items would provide suitable test
characteristics. In addition, since there is evidence that RTOOE is
a general trait across different types of examinations (7),
convenient synonym-antonym vocabulary items were the type employed
in the measures. Subjects were directed to indicate whether the
words had the same or opposite meaning, and were informed of the
penalty for incorrect responses. The following is an example of a
nonsense item used in the measures:


?. marnel .......... mild


Since "marnel" is meaningless, the item has no answer. Hence, any
response (i.e., "same" or "opposite") is assumed to be an example of
RTOOE behavior; if the item is omitted, a lack of RTOOE behavior is
indicated.


In order for grade trends to be examined, the nonsense items which
formed the basis of the risk measures were constructed so that they
could be used at all grade levels; the legitimate items were
selected to be appropriate for the particular grade level, and
generally appeared at a single grade level. The RTOOE score assigned
to an S was the proportion of nonsense items attempted. The
instruments to measure RTOOE were constructed to the above
specifications, and were tried out on selected classes from grades 5
through 11 in a large village in western New York State. The
procedure used for estimating reliability was the Kuder-Richardson
formula 20 (K-R 20), which is a measure of internal consistency, or
the tendency of the test items to be homogeneous (e.g., 3,161). The
analysis of these preliminary data revealed a median K-R 20 of .78
across the seven grades for the RTOOE measure. Additional evidence
of the reliability and construct validity of this method of
measuring RTOOE is provided elsewhere (7).
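The two quantities defined above, the RTOOE score (proportion of nonsense items attempted) and the K-R 20 reliability, can be sketched as follows. This is an illustrative sketch, not the authors' scoring procedure: the function names, the use of `None` to mark an omitted item, the use of the population variance of total scores, and the small data matrix are all assumptions made for the example.

```python
def rtooe_score(nonsense_responses):
    """Proportion of nonsense items attempted.

    Each entry is "same", "opposite", or None for an omitted item.
    """
    attempted = sum(1 for r in nonsense_responses if r is not None)
    return attempted / len(nonsense_responses)


def kr20(item_matrix):
    """Kuder-Richardson formula 20 for dichotomous (0/1) items.

    item_matrix: one row per examinee, one column per item.
    KR-20 = k/(k-1) * (1 - sum(p*q) / variance of total scores),
    here using the population variance (an assumption).
    """
    n = len(item_matrix)
    k = len(item_matrix[0])
    totals = [sum(row) for row in item_matrix]
    mean_total = sum(totals) / n
    var_total = sum((t - mean_total) ** 2 for t in totals) / n
    sum_pq = 0.0
    for j in range(k):
        p = sum(row[j] for row in item_matrix) / n  # proportion attempting item j
        sum_pq += p * (1 - p)
    return (k / (k - 1)) * (1 - sum_pq / var_total)


# Hypothetical data: 1 = nonsense item attempted, 0 = omitted.
attempts = [
    [1, 1, 1, 1, 1],
    [1, 1, 1, 1, 0],
    [1, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
]

print(rtooe_score(["same", None, "opposite", None, None]))
print(round(kr20(attempts), 3))
```

With only five nonsense items per test, K-R 20 is high only when examinees tend to attempt either most or few of the items, which is the homogeneity the text refers to.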


Subjects for the study were all available public school students in
grades 5 through 11 in the same New York village where the
preliminary data were collected. There were a total of 1,070 Ss,
consisting of 522 males and 548 females. The number in each grade
ranged from 118 to 228. The entire study was then replicated in a
small city in northern Michigan, with a total of 1,291 Ss,
consisting of 600 males and 691 females. The number in each grade
for the Michigan study ranged from 140 to 208. The tests were
administered to the Ss in their own classrooms by their own
teachers. The teachers had been previously instructed as to
standardized procedures of administration. The Ss were generally led
to believe that they were taking another aptitude examination in
their school's testing program. The tests were given as Part I of an
"aptitude" examination on the same day to all classes in a given
school, and within several days to the entire set of classes in a
school system.


TABLE 1 
K-R 20 RELIABILITIES FOR GRADES 5 THROUGH 11 


grade 


ары % 6 7 в 9 10 m 


RESULTS AND DISCUSSION 


The K-R 20 reliabilities for the RTOOE measure are presented in
Table 1. The values for the seven grades in the New York system
ranged from .68 to .86, with a median value of .83. With the
Michigan Ss, the reliabilities ranged from .78 to .88 with a median
of .86. Hence, it appears that the RTOOE measure was generally
reliable across grades 5 through 11 for both school systems.


Table 2 presents the mean risk scores for the New York State Ss by
sex and grade. Numbers in parentheses are the sample sizes; the
estimate of mean square within and its corresponding degrees of
freedom are provided at the bottom of the table. A sex-by-grade
factorial analysis of variance was performed; because the cell
frequencies were unequal, an exact least squares analysis proposed
by Bock (1) was utilized. The results indicated that the grade
effects (with the effects of sex eliminated) were significant at the
.05 level. The proportion of RTOOE variance associated with
variation in grade level, or η² (5), was approximately .11. Neither
the sex effects (with grade effects eliminated) nor the sex-by-grade
interaction effects (with both main effects eliminated) were
significant at the .05 level.
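The proportion of variance associated with an effect, η², is the effect sum of squares divided by the total sum of squares. The sketch below shows the computation for the simplest one-way case with equal group sizes; the study itself used an exact least squares analysis for unequal cell frequencies, which this example does not reproduce. The group labels and scores are hypothetical.

```python
def eta_squared(groups):
    """One-way eta squared: SS_between / SS_total.

    groups: mapping from group label to a list of scores.
    """
    all_scores = [x for scores in groups.values() for x in scores]
    grand_mean = sum(all_scores) / len(all_scores)
    ss_total = sum((x - grand_mean) ** 2 for x in all_scores)
    ss_between = 0.0
    for scores in groups.values():
        group_mean = sum(scores) / len(scores)
        ss_between += len(scores) * (group_mean - grand_mean) ** 2
    return ss_between / ss_total


# Hypothetical RTOOE scores (proportion of nonsense items attempted).
example = {
    "grade 5": [1.0, 0.8, 0.6],
    "grade 8": [0.4, 0.2, 0.6],
}

print(round(eta_squared(example), 2))
```

An η² of about .10 or .11, as reported for grade level, therefore means that roughly a tenth of the total variation in RTOOE scores lies between grades.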


The results for the Michigan replication are provided in Table 3.
Once again, the grade effects (with the sex effects eliminated) were
significant at the .05 level, with a corresponding η² of
approximately .10. In addition, the sex effects (with the grade
effects eliminated) were significant at the .05 level. However, η²
for the latter was approximately .01. As before, the sex-by-grade
interaction effects (with both main effects eliminated) were not
significant at the .05 level.


It would appear, therefore, that there is evidence that grade level
(or age) and RTOOE are related, and that the relation is fairly
sizable (approximately 10 percent of the variance in RTOOE is
accounted for by differences in grade level). From an inspection of
the data, it would further appear that Ss at grade levels 5 through
7 have greater mean RTOOE than Ss at grade levels 8 through 11.
Scheffé post hoc comparisons confirm this conjecture at the .05
level for both the New York and Michigan data. The finding that risk
taking was greater for younger children is consistent with the
results of Cohen (2).

The conclusions with respect to sex are not as clear as those with
respect to grade level. There is some evidence with the Michigan
data that sex is related to RTOOE (.05 level), with the data
indicating that males are greater risk takers. However, the relation
between sex and RTOOE for the Michigan data was not strong
(approximately 1% of the variance in RTOOE can be accounted for by
sex), and the relation was not replicated with the New York data.
Hence, the findings with respect to sex are somewhat similar to that
of Wallach and Kogan (12, 13) in that they are not consistent. At
the same time, the findings somewhat support the results of Kass (4)
that males were greater risk takers than females in one of the
school systems. It is, however, interesting to note that the present
findings are in direct opposition to a previous study (7) which
resulted in eighth grade females demonstrating consistently higher
RTOOE than eighth grade males. Hence, it seems that the relation
between sex and risk (or more specifically between sex and RTOOE)
remains unclear.


New York .78
Michigan .88 .86


TABLE 2


MEAN RISK SCORE FOR NEW YORK STATE Ss (SAMPLE SIZE IN PARENTHESES)


grade


MS_w = .0945, d.f. = 1056


It should be noted that grade level differences in mean RTOOE are
confounded not only with differences in Ss (since these are cross
sectional studies) but also with differences in teachers or test
administrations. The teacher influence could certainly be potent
since in their classes certain teachers might advise their students
to "guess at everything" or to "omit questions unless you are sure
of the answer," etc. The administrator effect would appear if, on
the criterion examination, certain teachers strayed inadvertently
from the standardized procedure.


In order to investigate the possibility of a teacher or
administrator effect, an expanded analysis of the present data was
performed. The expanded analysis considered classes as nested within
grades (Ss who shared the same teachers or took the criterion
examination together were considered as one class). In addition,
both grade and class-within-grade were crossed with sex. For the New
York Ss, only the grade effects were significant at the .05 level,
with the corresponding η² approximately equal to .11. Hence, there
was no evidence that classes or administrations within grades were
an important source of variation.


TABLE 3 


With the Michigan data, however, both the sex-by-grade and the
sex-by-class-within-grade interactions (each with all other effects
eliminated) were significant at the .05 level. For each interaction,
the η² was about .01. Classes nested within grades (with all main
effects eliminated) was significant at the .05 level, with η²
approximately equal to .12. Hence the results from the Michigan data
appear to be much more complicated than those from the New York
data, with some evidence for a teacher or administrator effect, but
also with some weak interactions; e.g., the sex effect varied from
class to class within a given grade. A possible explanation for the
more complicated Michigan data is the much shorter time period spent
with those teachers in their orientation session; i.e., the strong
relation between classes within grades and RTOOE is perhaps simply
reflective of a lack of success in accomplishing standardization in
the administration of the tests. In addition, the two significant
interactions were both weak, and one (sex-by-class-within-grade)
difficult to explain. (A plausible conjecture that the sex of the
teacher was involved proved to be groundless.)


In summary, there is evidence that RTOOE is related to grade level
(or age), with higher RTOOE associated with grades 5, 6, 7 than with
grades 8, 9, 10, 11. If there is a relation between RTOOE and sex,

MEAN RISK SCORE FOR MICHIGAN Ss (SAMPLE SIZE IN PARENTHESES)


9 


grade 
e% 5 6 7 8 
n UL .83 .82 151 
(83) (63) (93) | (104) 
E 3 
` (92) 


w 5.1340 


[”[» | a | 


» d.f. = 1275 


10 
.55 
(95) 

42 
(103) 


.64 
+7 +78 “1 +64 
(77) (104) (100) 


u 
E 
(82) 

65 


(125) 
it appears to be weak. Finally, there is some preliminary evidence
that the teacher or test administrator may have some effect on
RTOOE.


With respect to education and/or psychology, we may consider the
following implications. First, it would appear that students become
more conservative on RTOOE as they grow older. Whether this
increased conservativeness is due to maturation, the educational
process, or combinations of these and other factors is not known.
Since the present study was cross-sectional, one might speculate
that the mean differences were due to a shifting population. For
example, school dropouts might be high in RTOOE, and their
disappearance from the school scene would result in lower RTOOE.
However, this phenomenon would be accompanied by a decrease in RTOOE
variance. An investigation of the RTOOE variance across grades
revealed either a steady pattern, or an increase, but not a
decrease. Hence, the dropout conjecture seems less plausible.
Planned longitudinal studies should shed more light on the shifting
population hypothesis.
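The reasoning behind the dropout conjecture can be illustrated with a small numerical sketch: if the highest risk takers leave the population, both the mean and the variance of the remaining scores fall. The scores and the cutoff below are hypothetical and chosen only to show the direction of the change; they are not data from the study.

```python
from statistics import mean, pvariance

# Hypothetical RTOOE scores (proportion of nonsense items attempted).
all_students = [0.0, 0.2, 0.4, 0.6, 0.8, 1.0]

# Assumed dropout rule for illustration: the highest risk takers leave.
DROPOUT_CUTOFF = 0.8
remaining = [score for score in all_students if score < DROPOUT_CUTOFF]

mean_before, mean_after = mean(all_students), mean(remaining)
var_before, var_after = pvariance(all_students), pvariance(remaining)

print(round(mean_before, 2), round(mean_after, 2))
print(round(var_before, 3), round(var_after, 3))
```

Because the observed variances held steady or rose across grades, this pattern of a falling mean with a falling variance is not what the data showed.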


Second, since there is evidence that RTOOE affects aptitude or
achievement scores, test constructors should be aware that (a) there
may be an administrator effect on RTOOE, (b) lower RTOOE is more
characteristic of grades 8 through 11 than grades 5 through 7,
(c) there may be school or geographic differences in RTOOE (note the
mean differences between the New York and Michigan school systems),
and (d) the mean difference in RTOOE for boys and girls appears to
be small.


Lastly, there is evidence that reliable measures of RTOOE can be
constructed for students as early as the fifth grade.


FOOTNOTE 
1. This research was supported by the Teacher Ed- 
ucation Research Center, State University Col- 
lege at Fredonia, New York. 
ABSTRACT


The influence of (1) Navaho, Spanish, and Anglo-American ethnic
backgrounds, (2) response options of inclusion, exclusion, and
indeterminate categorization, (3) two degrees of stimulus
complexity, and (4) sex were examined during information processing
in concept attainment. Subjects were a stratified sample of
eighty-four ninth grade students selected from all the ninth grade
students attending a Bloomfield, New Mexico
( TIPT). A desi 2 e dependent variable consisted of responses to the Tagatz Information Processing T est ! 
ава іа) Е X тереза теаѕигеѕ analysis of variance yielded four significant sources of variance. The 
option А cd [ег ormances were each significantly different from the other. The ethnic group by response 
tins at РЕ п that the hierarchy of Navaho, Spanish,and Anglo-American was not evident for 
tically different Abe е categorization. For items of this type, the means of the three groups were not statis- 
complex presántat m each other. The response of stimulus complexity interaction indicated that items in the 
ation with an exclusion response were significantly different from all other cells in the inter- 


action, 


The effect of a person's genetic and cultural background upon his IQ
score and academic performance has received considerable attention
in recent literature (1, 2, 3, 4, 5). Jensen (3) has hypothesized
that 80 percent of the IQ variance of European and North American
populations is determined by heredity (H). Elkind (2), on the other
hand, interprets differences in terms of Piaget's structuralism,
believing that intelligence is developed through experience.


An examination of the pro and con arguments indicates areas of
agreement between Jensen and his critics. The foremost is that
members of different races, cultures, and socioeconomic strata
display differing intellectual abilities. The cause can be genetic,
psychogenic, cultural, or a combination of these factors, and most
schools, as they presently operate, appear to maximize rather than
minimize these differences (4). Another substantial area of
agreement is that more study is needed to determine such factors as
(1) the cause of the large differences among racial groups in school
performance and test scores, (2) dysgenic trends, and (3) the best
method(s) to educate culturally distinct groups.


In a comparison of information processing dur- 
ing concept attainment using third and fourth graders 
within the same school, Tagatz, Layman, and Need- 
ham (9), found no significant differences between 
(SES ). Marascuilo and Amster (8) did find a sig- 
nificant difference favoring the upper SES over lower 
SES children. Their Ss were fifth and sixth grade children in
separate schools. The stimuli in the present study are similar to
those used by Tagatz, Layman, and Needham (9) where SES factors were
minimal or nonexistent. Jensen (3), summarizing several of his own
research articles, found that with certain learning tasks
lower-class children, be they white, Negro, or Spanish-American,
perform as well as middle-class children in the same IQ range.
Disparate results such as these illustrate the need for research
about the nature of systematic differences between sociocultural
groups.


THE PROBLEM 


The general purpose of this study was to in- 
vestigate differences in information processing dur- 
ing concept attainment of ninth grade students from 
Navaho, Spanish, and Anglo-American ethnic back- 
grounds. The following specific question was ex- 
amined: 


What are the effects of the following variables
on information processing during concept attainment:
(a) ethnic background: Navaho, Spanish,
and Anglo-American; (b) response option: inclusion,
exclusion, and indeterminate categorization;
(c) task complexity: two degrees of
stimulus complexity; and (d) sex?


METHOD 
Subjects 


Subjects for the study were a str 


h-American males, 
The mean 


The Ss represented a wide range of 
els. The mean IQ score on the Ot 
Test of Mental Ability was 96. 67 
deviation of 16. 12, The Anglo males had 
IQ of 98.74, Anglo females 110, onea 
80. 44, Navaho females 80, 60, Spanish-American 
males 95, 89, and Spanish-Amerj 
In summary, the ethnic groups i 
sented a wide range of intellectual] abilities, 


Experimental Materials 


The TIPT has been described by Tagatz and Meinke (10) and by Lemke,
Klausmeier, and Harris (6). The TIPT contains sixty items divided
into two subtests. The first subtest consists of thirty items in
which one instance, either an exemplar or non-exemplar, is presented
with an exemplar focus instance. The task is to specify the
inclusion, exclusion, or indeterminate status of another instance
with respect to membership in the set of instances exemplifying a
concept. Fifteen exemplar and fifteen non-exemplar items were used.
In the exemplar items, ten test instances (i.e., instances for which
membership was to be determined) were definitely exemplars of the
same concepts that the focus instance exemplified. The membership of
the remaining five test instances could not be determined. In items
presenting a focus and a non-exemplar instance, ten test instances
were definitely non-exemplar and five again were indeterminate.
Thus, the first subtest of thirty items could be scored on the basis
of the information presented (fifteen exemplars and fifteen
non-exemplars) or on the basis of membership (ten exemplars, ten
non-exemplars, and ten instances of indeterminate membership). These
three conditions of membership constituted the response options
available to Ss.


The second subtest of thirty items was constructed with the use of
the same focus and test instances as the first. The information
presented in addition to the exemplar focus instance consisted of
two other instances, rather than one as in the first subtest. One of
the two instances for each item was an additional exemplar; the
other was the same in kind as its counterpart in the first subtest.
The answers to items of Subtest 2 were exactly the same as Subtest 1.
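The item structure described above can be summarized in a short sketch that builds an answer key and tallies it both ways (by information presented and by membership). The item order, the label strings, and the function names are assumptions made for illustration; the source does not give the actual ordering of TIPT items.

```python
from collections import Counter


def subtest_key():
    """Answer key for one 30-item subtest as (information presented, correct option).

    Order is hypothetical; only the counts come from the description.
    """
    key = []
    key += [("exemplar", "inclusion")] * 10
    key += [("exemplar", "indeterminate")] * 5
    key += [("non-exemplar", "exclusion")] * 10
    key += [("non-exemplar", "indeterminate")] * 5
    return key


def score_by_option(key, responses):
    """Count correct responses separately for each response option."""
    correct = Counter()
    for (_, option), response in zip(key, responses):
        if response == option:
            correct[option] += 1
    return correct


# Subtest 2 uses the same answers as Subtest 1, so the full key repeats it.
full_key = subtest_key() * 2
by_information = Counter(info for info, _ in subtest_key())
by_membership = Counter(option for _, option in subtest_key())

print(by_information)
print(by_membership)
print(score_by_option(full_key, ["inclusion"] * 60))
```

Under this layout each response option can contribute at most twenty correct answers across the sixty items, which gives a scale for reading the mean performances reported for each option.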


PROCEDURES 


In order to ensure that the directions and examples were fully
understood, the TIPT was administered over three class periods. The
first was used for orientation. Transparencies of the instructions
were prepared and used in explaining each part of the instructions.
Subjects were encouraged to stop


Period was used to examine the instructions for Sub- 
test 2, As wit 


imposed, 


ANALYSIS AND RESULTS 


[...] construction of the test, namely the degree of stimulus complexity and the inclusion, exclusion, and indeterminate categorization of the response option.

Four sources of variance were found to be statistically significant (see Table 1). These were [...]

The mean performances of the Navaho, Spanish, and Anglo-Americans were 25.75, 30.54, and 3[...], respectively. Duncan's Multiple Range Test [...] each mean was significantly different [...]

The mean performances of the inclusion, exclusion, and indeterminate response options were 12.99, [...], and 9.2[...], respectively. Here [...] Duncan's Multiple Range Test indicated that each [...] was significantly [...]

In Table 2 are reported the means of the performance scores for the ethnic group by response option interaction and Duncan's Range for these data. The hierarchy of Navaho, Spanish, and Anglo-American was most apparent in inclusion-type items,
TABLE 1

ANALYSIS OF VARIANCE OF INFORMATION PROCESSING SCORES

Source                        df      MS        F
Sex (S)                        1     19.44     1.38
Ethnic Group (E)               2     83.80     5.95**
S x E                          2     21.79     1.55
Ss w groups                   78     14.08
Response Option (R)            2    287.65    42.94**
S x R                          2        58
E x R                          4     19.19     2.86*
S x E x R                      4      3.81
R x Ss w groups              156      6.69
Stimulus Complexity (SC)       1      8.90     2.24
S x SC                         1      6.45     1.62
E x SC                         2      1.23
S x E x SC                     2      1.66
SC x Ss w groups              78      3.97
R x SC                         2     12.18     3.25*
S x R x SC                     2     [...]
E x R x SC                     4       .30
S x E x R x SC                 4      4.95     1.32
R x SC x Ss w groups         156      3.75

*  p < .05
** p < .01
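As a reading aid for Table 1, and not as part of the original analysis, each F can be understood as the mean square of an effect divided by the mean square of the error term listed beneath its block of effects. The short Python sketch below uses mean squares copied from the table; the function and dictionary names are our own, and the pairing of effects with error terms is our reading of the table layout. Under those assumptions the computed ratios agree with the printed F values to within rounding.

```python
# Mean squares are copied from Table 1; all names here are illustrative.

def f_ratio(ms_effect: float, ms_error: float) -> float:
    """Return F as the effect mean square divided by the error mean square."""
    return ms_effect / ms_error


# Error mean squares, keyed by the block of effects they are assumed to test.
MS_ERROR = {
    "between": 14.08,  # Ss within groups
    "R": 6.69,         # R x Ss within groups
    "SC": 3.97,        # SC x Ss within groups
    "RxSC": 3.75,      # R x SC x Ss within groups
}

# (effect name, effect mean square, key of its error term)
EFFECTS = [
    ("Sex", 19.44, "between"),
    ("Ethnic Group", 83.80, "between"),
    ("Sex x Ethnic Group", 21.79, "between"),
    ("Ethnic Group x Response Option", 19.19, "R"),
    ("Stimulus Complexity", 8.90, "SC"),
    ("Response Option x Stimulus Complexity", 12.18, "RxSC"),
]


def table_f_ratios() -> dict:
    """Compute F for each listed effect, rounded to two decimals."""
    return {
        name: round(f_ratio(ms, MS_ERROR[err]), 2)
        for name, ms, err in EFFECTS
    }


if __name__ == "__main__":
    for name, f in table_f_ratios().items():
        print(f"{name}: F = {f}")
```

Keeping the error terms in a separate dictionary makes explicit which denominator each effect is assumed to use in this mixed design.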


with each cell significantly different from every other cell. With items of indeterminate categorization, the hierarchy was only partially evident in that the Spanish and Anglo-Americans were significantly different from the Navaho but not different from each other. For items with responses of exclusion, the differences between the groups were not significant.


The means of the performance scores for the response option by stimulus complexity interaction and Duncan's Multiple Range Test for these data are presented in Table 3. Here the complexity created by the addition of another exemplar in each item interfered with performance on exclusion-type response items. In the complex presentation, two exemplars and one non-exemplar were shown, and it seems that this additional exemplar was given attention, to the detriment of the non-exemplar processing. Items


TABLE 2

MEANS OF PERFORMANCE SCORES FOR THE ETHNIC GROUP BY RESPONSE OPTION INTERACTION AND DUNCAN'S NEW MULTIPLE RANGE TEST

            Inclusion   Exclusion   Indeterminate   Duncan's Range
Anglo         15.36        8.35        10.50
Spanish       12.80        7.50         9.[...]          1.57
Navaho        10.61        7.82         7.3[...]


TABLE 3

MEANS OF PERFORMANCE SCORES FOR THE RESPONSE OPTION BY STIMULUS COMPLEXITY INTERACTION AND DUNCAN'S NEW MULTIPLE RANGE TEST

                        Inclusion   Exclusion   Indeterminate   Duncan's Range
Single Presentation        6.39        4.38         4.68
                                                                     .67
Complex Presentation       6.56        3.51         4.58


in the complex presentation with an exclusion re- 
sponse were significantly different from all other 
cells in the interaction. 


DISCUSSION 


The purpose of this study was to investigate differences in cognitive functioning of adolescents from the three ethnic backgrounds, but not to determine whether heredity or environment is the cause for such differences. While this latter question has academic interest, the task of schools is to educate all citizens optimally; and, as such, functional differences are of practical as well as theoretical importance.


The conclusion that is reached from the data is that performance differences exist among these three ethnic groups. Beyond the total quantitative differences observable among the three groups, differences were found among groups in the degree of success with which various kinds of information were processed. The response option by ethnic group interaction revealed marked differences in the ability of the groups to deal with material exemplary of a concept. A similar trend was found when the material was indeterminate in its ability to yield relevant information about t[...] When the stimulus was designe[...] instance to a conceptual category, [...] among the three groups were not evident.

It has been pointed out by Lemke, Tagatz, and Meinke (7) that concept attainment of exemplar information correlates highly with all curricular factors. This was not the case with processing of non-exemplar information. Most cognitive learning is based on illustrating concepts, and less attention is given to differences between conceptual categories. This suggests that a learning set may be operative in much human learning similar to that found by Tagatz, Walsh, and Layman (11), where Ss initially left to their own devices did not adapt to the exigencies demanded of them in information processing of both exemplar and non-exemplar types (conservative instructions). It is further suggested that, when processing a novel type of information (exclusion type items), differences among the three groups are not [...]
[...] negates the second of these alternatives. Performance differences on the novel exclusion type items were not evident, even though the Ss represented such a wide range of ability levels. This explanation supports Elkind's conception of intelligence rather than Jensen's.

[...] human functioning. When an additional exemplar was presented, performance deteriorated markedly for [...] information. It is a cognitive focusing on figure to the exclusion of ground so often seen in other perceptual tasks. It is important to note, though, that in [...] be remiss in properly preparing students for such practices.

[...] and Scholastic Achievement?" Harvard Educational Review, 39: 1-193, 1969.

4. Jensen, A. R., "Reducing the Heredity-Environment Uncertainty: A Reply," Harvard Educational Review, 39: 449-483, 1969.

5. Jensen, A. R., "Social Class, Race, and Genetics: Implications for Education," American Educational Research Journal, 5: 1-42, 1968.

[...]ogy, 58: 27-35, 1967.

[...]ing and Curricular Achievement," Journal of Experimental Education, 38: 70-75, 1969.

Tagatz, G. E.; Layman, J. A.; Needham, J. R., "Information Processing of Third and Fourth Grade Children," Contemporary Education, in press.
ABSTRACT

It has often been suggested that individuals will prefer dates who play "hard-to-get." Two experiments were conducted to test the hypothesis that teen-agers will assume that a hard-to-get individual is more socially desirable than a person whose high regard is easily obtained. This hypothesis was not confirmed; the results were opposite to those predicted. It appears that playing hard-to-get is not an effective strategy for increasing one's status. Apparently, all the world does love a lover.

FIFTEEN YEARS ago, teachers were clear about their assignment; the formulation of behavioral objectives was restricted to the mastery of the three R's. Today, teachers express great concern about their students' social adjustment, peer relations, self-concepts, mental health, and moral and personality development. In fact, many educators argue that such adjustments are prerequisite for academic achievement. It is not uncommon to find that goals concerned with social and personality adjustment take priority over goals concerned with concept attainment in many experimental or innovative programs.

At the same time, there are very few guidelines provided for the teacher who wishes to contend with the emotional needs of his students. Neither teacher trainees' preparatory course-work nor the research literature provides such information. Thus, even the conscientious teacher is forced to rely entirely on his own intuition in these areas.

The following study investigates some of the factors which influence teen-agers' perception of one another's social worth.

THE ELUSIVE PERSON

Socrates' advice to Theodota, a hetaera, on how to win friends and influence people was direct: play "hard-to-get":

They will appreciate your favors most highly if you wait till they ask for them. The sweetest meats, you see, if served before they are wanted seem sour, and to those who had enough they are positively nauseating; but even poor fare is very welcome when offered to a hungry man. (Theodota inquires) and how can I make them hunger for my fare? (Socrates' reply) Why, in the first place, you must not offer it to them when they have had enough, but prompt them by behaving as a model of propriety, by a show of reluctance to yield, and by holding back until they are as keen as can be; for then the same gifts are much more to the recipient than when they are offered before they are desired (8).


The notion that an individual can become desirable by playing hard-to-get is not only part of our folklore but part of the folklore of other times and countries. While Ovid, the Kama Sutra, and Dear Abby all agree that the lover should not display his affection too readily, no experimental evidence exists to document the effectiveness of the hard-to-get strategy.

There are some correlational data which indicate that those who appear to be greatly in need of affection are not held in high regard. Ehrlich (personal communication, 1969) found that mental patients who admitted possessing a strong need for approval were less popular among other patients and among the staff than were other patients. Ehrlich points out that her results agree with those reported by Crowne and Marlowe (1), who found a negative correlation between the "approval dependence" of fraternity men and their popularity with other men. In these correlational studies it is not possible to determine if individuals have a strong need for approval because they have been rejected by others or if their desperate need for approval causes them to be rejected.


If being "hard-to-get" does in fact increase one's desirability, several theories might account for this phenomenon.

1. Dissonance Theory (3): The person who is hard-to-get requires a suitor to expend more effort in her pursuit than he would normally expend. One way the suitor can justify his unwarranted expenditure of energy is by aggrandizing the hard-to-get woman.

2. [...] hard-to-get woman can maximize the impact of the rewards she provides.

3. Social Perception Theory: Individuals use information as to another's social standing on one trait as a clue to his standing on related characteristics. For example, individuals may have discovered that very socially desirable dates are harder-to-get than undesirable partners. The two concepts ("hard-to-get" and "socially desirable") might thus become associated. As a consequence, if a girl can successfully simulate being hard-to-get, she may be able to improve others' perception of her desirability.


The first two theoretical explanations of the hard-to-get phenomenon suggest that playing hard-to-get should alter only the suitor's perception of the hard-to-get romantic partner. Social Perception Theory suggests that the hard-to-get individual should impress an even wider constituency. Not only potential suitors, but uninvolved observers as well, should perceive the hard-to-get person as especially socially desirable.

The above rationale leads one to hypothesize that the more romantic interest a stimulus person expresses in a given romantic partner, the less socially desirable that stimulus person will be judged to be by an outside teen-age observer.


ALTERNATIVE HYPOTHESIS 


An alternative, and somewhat more complicated, hypothesis also may be proposed. It could be argued that a stimulus person might gain or lose stature by expressing romantic interest in another, depending on how socially desirable the other is.


This hypothesis follows from research by Goffman (4), Kiesler and Baral (5), Dion and Berscheid (2), and Walster and Walster (7), demonstrating that individuals prefer romantic partners of approximately their own level of "social desirability." If teen-agers assume that attractive people are most likely to express romantic interest in attractive others, while the unattractive will only admit to liking the unattractive, the teen-agers might use such [...]

Thus, if an ostensibly attractive person expresses great romantic interest in [...] the lover should lose stature as a consequence of his liking, while the beloved [...] An ostensibly unattractive person who expresses romantic interest in an attractive partner should gain stature by his liking; the beloved should lose stature.


Thus, one may hypothesize that the attractiveness of a stimulus person, the attractiveness of his partner, and the extent of his romantic interest for the partner should all be important determinants of how socially desirable the stimulus person and the partner appear to be to an outside observer.

The two experiments reported here were designed to investigate whether or not the knowledge that a person was hard-to-get affected a teen-ager's evaluation of that person. [...] attractiveness of the stimulus person, the attractiveness of his partner, and the amount of romantic interest the stimulus person expressed for the partner [...] Experiment II was similar to Experiment I in every detail, with the exception that the sexual similarity of the stimulus person and his partner was systematically varied.


METHOD—EXPERIMENT I 


Subjects and Procedure 


Subjects were 144 high school juniors and seniors 
who belonged to various youth groups in the Roches- 
ter, New York area. They were paid $2.00 each for 
their participation. 


To provide a rationale for asking the Ss to rate other students, the experimenter said she was investigating factors which may affect romantic attraction. [...] was interested in romantic liking [...] "friendship." After this introduction, a detailed description of the [...] to the Ss:

We'd like you to help us in setting [...] we'll be running in the Fall. [...]t know each other, [...] meet together four times, [...] don't have all the information we'd like as yet, so you'll just have to bear with us.


There are three things we would like you to do. First, go through the booklet and read all the information about both students. Don't answer any questions right away. Instead, think about both of them for a few minutes. Try to imagine what they're both like, how they'd act with one another, and so forth. Then, give us your honest impressions of them. Don't tell us what you think you should think, or what other people would think. Just tell us what you think. Don't hesitate to use the extremes in rating if they seem applicable.

After you've answered a question, you can comment on the question itself, if you wish. If you feel it is unclear, or should be put another way, then make a note on your sheet suggesting how it might be improved.


Subjects were then given a booklet containing the picture and biography of one male and one female student. Half of the time the stimulus person depicted in each photograph was physically attractive; the remainder of the time he was ugly. Beneath each picture was a paragraph describing the school activities of the person depicted. If the person was attractive, the background information implied that he was a very socially desirable individual. For example, the attractive boy's biography said:


Bill is 17 and graduated this June from a New York high school. During the past year he was an active participant in extra-curricular activities at his school. He was a class officer, a member of the football team, one of the editors of the school yearbook, and a member of the band. His hobbies include sports, at which he has unusual natural abilities. Bill is also an officer in one of his community's youth groups. He plans to study medicine for his future career.


If the stimulus person depicted in the photograph was 
physically unattractive, the background information 
indicated that he was not socially desirable. For ex- 
ample, the ugly boy’s biography said: 


Jack is 17 and graduated this June from a New York high school. During the past year, he was not an active participant in extra-curricular activities at his school, but he did help to sell yearbooks and was a member of the band. Outside of school he does some swimming and team sports, although he does not have too much skill at them. Occasionally Jack attends meetings of one of his community's youth groups.


Finally, Ss were told how romantically interested the first stimulus person was in the partner after they had met with each other four times. The stimulus person was said to have liked the other extremely much or not particularly much, or no liking information was provided. If the stimulus person was "extremely romantically interested" in his partner, the following paragraph was added to his biography:


At the conclusion of their four meetings together, Bill was asked to tell us honestly how much liking he felt for Nancy, and how much time he would be interested in spending with her in the future. He said (1) he liked her extremely much, and (2) that he would enjoy spending a great deal of time with her in the future.


If he was not to be particularly interested in his 
partner, the last sentence read: 
He said (1) he did not particularly like her, and 
(2) that he would not want to spend time with her 
in the future. 


If the stimulus person’s liking for his partner was 
to be unknown, the sentence read: 


We do not have information about whether he 
likes or dislikes Nancy. 

Subjects were never told how much the partner 
liked the stimulus person. 

The variations just described yielded a 2x2x3 design: Attractiveness of the stimulus person by Attractiveness of the partner by The stimulus person's romantic interest in the partner. Half the Ss assigned to each cell were male and half were female.
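The factorial structure just described can be enumerated mechanically. The following Python sketch is an illustration only: the level labels and function names are ours, and the per-cell count assumes that the 144 Ss were spread equally over the cells, which the text does not state outright.

```python
from itertools import product

# Factor levels as described in the text; the labels are illustrative.
SP_ATTRACTIVENESS = ("attractive", "unattractive")
PARTNER_ATTRACTIVENESS = ("attractive", "unattractive")
ROMANTIC_INTEREST = ("great interest", "unknown", "great disinterest")


def design_cells():
    """Return every cell of the 2x2x3 design as a tuple of factor levels."""
    return list(
        product(SP_ATTRACTIVENESS, PARTNER_ATTRACTIVENESS, ROMANTIC_INTEREST)
    )


def subjects_per_cell(n_subjects: int, n_cells: int) -> int:
    """Return the per-cell count under the assumption of equal allocation."""
    if n_subjects % n_cells != 0:
        raise ValueError("subjects cannot be divided equally among cells")
    return n_subjects // n_cells


if __name__ == "__main__":
    cells = design_cells()
    n = subjects_per_cell(144, len(cells))
    print(f"{len(cells)} cells, {n} Ss per cell")
    for cell in cells:
        print(cell)
```

Under the equal-allocation assumption, each cell would hold an even number of Ss, which is compatible with the statement that half of each cell was male and half female.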


Dependent Variables 


After considering the photographs and biographies of the stimulus person and his partner for some time, and imagining what it would be like to associate with both teen-agers, Ss were asked to complete a questionnaire composed of the following ten questions: (1) How popular would stimulus person (SP) be with the girls at your school? (2) How popular would SP [...] (3) [...] would like SP? (4) How likely is it that SP is the kind of person you would want to spend much time with? (5) How physically attractive do you think SP is? (6) How much would you guess the partner (P) likes SP? (7) How likely is it that SP is the kind of person who would want to spend much time with you? (8) How physically attractive do you think P is? (9) How popular would you guess P would be with the students at your school? (10) What clues did you use [...] judgments about each member of the [...] feel about [...]


Scores on questions 1-6 were summed to form an Index of the stimulus person's Social Desirability. Questions 8 and 9 were summed to form an Index of the partner's Social Desirability. (The lower the score on each index, the more socially desirable the stimulus person was judged to be.)
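The scoring rule above amounts to two simple sums. The Python sketch below illustrates it under stated assumptions: the function and constant names are ours, the ratings are taken to be numeric and keyed by question number, and the sample ratings are hypothetical rather than data from the study (the rating scale itself is not specified in the text).

```python
from typing import Iterable, Mapping

# Question numbers entering each index, as stated in the text.
SP_ITEMS = (1, 2, 3, 4, 5, 6)
PARTNER_ITEMS = (8, 9)


def desirability_index(ratings: Mapping[int, float], items: Iterable[int]) -> float:
    """Sum the ratings for the given question numbers.

    Lower totals mean the person was judged more socially desirable.
    """
    items = tuple(items)
    missing = [q for q in items if q not in ratings]
    if missing:
        raise KeyError(f"missing ratings for questions {missing}")
    return sum(ratings[q] for q in items)


if __name__ == "__main__":
    # Hypothetical ratings from one subject, for illustration only.
    one_subject = {1: 3, 2: 2, 3: 3, 4: 2, 5: 2, 6: 3, 7: 4, 8: 1, 9: 2}
    print("SP index:", desirability_index(one_subject, SP_ITEMS))
    print("Partner index:", desirability_index(one_subject, PARTNER_ITEMS))
```

Questions 7 and 10 are deliberately left out of both sums, since the text assigns them to neither index.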


EXPERIMENT II

Subjects and Procedure

Subjects were 128 high school students from the Rochester area.

As previously mentioned, the experimental design of Experiment I was duplicated in Experiment II with the exception that the sexual similarity of the stimulus person and his partner was systematically varied. This necessitated a modification in the experimental procedure. Although E used the same rationale in Experiment II as in Experiment I, she could no longer plausibly claim to be interested in the factors that affect romantic attraction. It was reasoned, however, that the Ss would assume that oppo[...]
TABLE 1

EXPERIMENT I: THE EFFECT OF A STIMULUS PERSON'S ROMANTIC LIKING FOR HIS PARTNER AND THE ATTRACTIVENESS [...] OF THE PARTNER, IN DETERMINING Ss' EVALUATIONS

Stimulus Person's    Stimulus Person's   Partner's         Perceived Social Desirability of Stimuli
Romantic Liking      Attractiveness      Attractiveness    Stimulus Person    Partner
for Partner

Great Interest       Desirable           Desirable             14.50            3.58
                     Desirable           Undesirable           14.17            7.33
                     Undesirable         Desirable             18.67            3.08
                     Undesirable         Undesirable           17.75            7.08
                                                             M = 16.27        M = 5.27

Unknown              Desirable           Desirable             15.83            3.25
                     Desirable           Undesirable           15.17            7.58
                     Undesirable         Desirable             20.25            3.[...]
                     Undesirable         Undesirable           18.25            7.[...]
                                                             M = 17.38        M = 5.48

Great Disinterest    Desirable           Desirable             16.00            [...]
                     Desirable           Undesirable           13.67            [...]
                     Undesirable         [...]                 [...]            [...]

[...] stimulus person was either [...] partner extremely much or be disinterested in further interaction with his partner. The condition in which Ss were given no information regarding the stimulus person's reaction to his partner was not included.

The same pictures and biographies described in Experiment I were used in Experiment II, and the stimulus pictures once again varied in attractiveness. Half the time the stimuli were extremely attractive, half of the time extremely unattractive. Appropriate background information was once again provided, and Ss were asked to answer the same questionnaire administered in Experiment I.

The experimental variations in Experiment II, then, yielded a 2x2x2 design: [...] of SP and P by Attractiveness of SP by Attractiveness of P by SP's Romantic Interest in P.

RESULTS

Manipulation Check

The social desirability of the stimuli was successfully manipulated. In both experiments the attractive stimulus person was judged to be more socially desirable than the unattractive person (in Experiment I, F = 63.22; in Experiment II, F = 25.58). The attractive partner was also judged to be more socially desirable than was the unattractive partner (in Experiment I, F = 190.68; in Experiment II, F = 195.05).

[...] person will appear to an outside observer), the data were clear. The results are diametrically opposed to those predicted. From Tables 1 and 2 it is evident that the more [...] stimulus person [...] teen-age [...] In Experiment I, this linear trend [...] (F = 3.4[...]) was not quite significant. In Experiment II, however, similar results [...] and this main effect [...] statistically significant (F = 8.11). The more the stimulus person liked his partner, the more socially desirable Ss perceived the stimulus person to be.

The stimulus person's Social Desirability Index [...] answers to six questions. [...] examine each of the six questions [...] find that when the stimulus person [...] romantically interested in his partner he [...] evaluated [...] highly on all six questions than when he [...] disinterested in his [...] On only one question [...] statistical significance [...] (the more romantic interest SP [...], the more Ss assumed that the stimulus [...] reciprocate his liking [...]). In Experiment [...] person was again rat[...] social desirability [...] only three of these items, however, [...]
TABLE 2

EXPERIMENT II: THE EFFECT OF THE STIMULUS PERSON'S LIKING FOR HIS PARTNER ON Ss' EVALUATIONS

Stimulus Person's       Sex of the Stimulus      Perceived Social Desirability of Stimuli (b)
Liking for His          Person and His           Stimulus Person    His Partner
Partner                 Partner

Great Liking (a)        Same Sex                     17.07             5.12
Great Liking            Opposite Sex                 18.04             6.41
Great Disinterest       Same Sex                     19.16             6.44
Great Disinterest       Opposite Sex                 19.41             6.57

a. N = 32 per cell
b. The lower the number, the more desirable the stimuli.


(F = 9.82), the more time Ss wanted to spend with him (F = 6.10), and the more Ss assumed the partner must have liked him (F = 25.43).


Alternative Hypothesis 


With respect to the alternative hypothesis (that whether or not a person gains or loses stature by expressing romantic interest in another depends on the social desirability of the object of his affection) the data are again clear. There is no support for the notion that the attractiveness of the stimulus person, the attractiveness of his partner, and the degree of liking SP expresses for P will interact in determining how socially desirable the stimuli are judged to be. The alternative hypothesis predicted that unattractive stimuli would gain stature if they liked or were liked by attractive individuals, and attractive individuals would lose stature if they liked or were liked by ugly individuals. These predicted 3-way interactions were all nonsignificant. First, consider Ss' ratings of the stimulus person's social desirability: In Experiment I, the F for the predicted 3-way interaction equalled .47; in Experiment II, F = .00. When we consider the Ss' ratings of the partner, the results are the same: In Experiment I, the F for the predicted 3-way interaction equalled .22; in Experiment II, F = .14.


The complete rejection of this hypothesis is somewhat surprising. Had the hypothesis been supported, the results would have been consistent with the findings of Kiesler and Baral (5), Dion and Berscheid (2), and Walster and Walster (7). In addition, the results would have been consistent with the common sense observation that individuals assume that they lose stature by liking or being liked by the "wrong" individuals. In informal interviews conducted with several of the high school girls, many confessed that [...]ly embarrassing to be asked out, in public, by socially undesirable boys. [...] embarrassment probably ari[...] an unacceptable person as [...] with the problem of publicly rejecting the undesirable suitor in a tactful way. However, the reason most commonly cited by the teen-agers for being embarrassed when asked out by a "creep" was that "my friends might think that I'd actually go out with someone like that!" The girls assumed they would lose status if they liked or were liked by others less desirable than themselves. The data collected in the present two experiments suggest that their fears may be groundless.


In sum, the present data indicate that people simply like people who like people. There is no evidence for the hypothesized effectiveness of a hard-to-get strategy. Both hard-to-get hypotheses failed to receive even a suggestion of support.


FOOTNOTES 


1. This research was financed in part by National 
Institute of Mental Health Grants 16661 and 
16729 and in part by the Office of the Dean of 
Students, University of Minnesota. We would 
like to thank Elaine Rosenwasser for running 
this experiment. 


2. "Social Desirability" was defined by Walster and Walster (7) as "The sum of an individual's social assets, weighted by importance and salience for others." Social assets such as physical attractiveness, popularity, personableness, and material resources were presumed to be important factors in determining one's social desirability level.


3. An experiment was run with Rochester high school 
seniors to insure that the photographs and bi- 
ographies of the “socially desirable” stimuli 
were perceived as more desirable than were 
the photographs and biographies of the less 
desirable stimuli. 


4. In Experiment I, df = 1 and 96. In Experiment II, df = 1 and 112.
ESTIMATED EFFECTS OF FOUR FACTORS

ON ACADEMIC PERFORMANCE

BEFORE AND AFTER TRANSFER

SAM C. WEBB
Georgia Institute of Technology

[...] four factors to academic performance [...] average grades "before" transfer [...] averages, the contribution of four factors (grading standards, preparation for advanced work, academic potential, and coping ability) to the difference between the [...] and the average for the first quarter "after" transfer for the transferring students were estimated. In the illustrative data, the largest contributor to an observed decrement of 1.2 letter grades was differential grading standards. The factor seemingly contributing least to the decrement was preparation for advanced work.

WHILE VARIOUS procedures (10, 12, 18) are available for studying the academic performance of college transfer students, the approach [...] followed emphasizes the comparison of academic [...] attrition rates (9, 14), time of graduation (15), and proportion on probation (17, 18), have been employed. Most studies following the before and after approach have considered primarily earned grade point averages (GPA's) as a criterion measure.

Some investigations (1, 2, 4, 17, 18) have considered students transferring from one 4-year [...] another; but the vast majority of studies have been based on junior college students transferring to senior institutions. [...]

While there are variations in results as reported from study to study, several strikingly consistent [...] decline in average grades immediately after transfer ([...] of forty-six studies; (10), forty[...] of forty-one studies). Average grades improve in subsequent quarters ((5) [...] studies) [...] cumulative [...] the before transfer average ([...] studies). Native students tend to earn higher grades than transfers ((5) twenty-two of twenty-three studies), while transfers have higher attrition rates ((5) nineteen of twenty-one studies).

[...] be little doubt that these [...] describe the academic performance of junior [...] Since resu[...] of students from no par[...] by virtue of their [...] design [...] seldom assist in assessing the role of various factors which may be influential in producing the results described.


For example, several studies show that ostensibly significant differences in averages favoring native students are substantially reduced or disappear when differences in academic potential are taken into account (8, 9, 11). Further, Knoell and Medsker (10: 96-98) have illustrated the influence of institutional characteristics, non-intellective characteristics of students, and general cultural factors on after transfer performance by demonstrating significant relationships of type of 4-year college, sex, and state differences to after transfer performance.


These and other findings suggest that after transfer performance may be jointly determined by various environmental and student characteristics acting together, so that by virtue of the singular combination of these factors for any given pair of colleges (one from which students transfer and one to which students transfer) the combination of factors and their relative degrees of importance in determining after transfer performance for the pair may well be unique. Suggestive of this possibility is a finding by Willingham (18) showing that optimal correctional weights for predicting grades at Georgia Tech for transfer students classified into sixteen homogeneous groups ranged from -1.1 to 0. Also suggestive is a report on nineteen Florida junior colleges indicating that average grades after transfer ranged from an increase of 0.02 of a letter grade to a decrease of 0.74 of a letter grade (1).
In summary, these findings suggest that while a variety of factors may relate to the after transfer performance of college students, presently available studies offer little assistance in identifying them or in assessing their relative influence on such performance. Further, by virtue of the variety of such factors (and even unique combinations of them) that may be found in pairs of colleges, studies based on students transferring from one single school to another single school may be helpful in understanding the dynamics underlying the academic performance of transfer students.


PURPOSE 


The present study attempts to assess the contribution of four factors as they affect the comparative performance before and after transfer for a group of students who transferred from one junior college to one 4-year college. The factors considered are grading standards, preparation for advanced work, academic potential, and coping ability. It was expected that the results would provide a better understanding of the dynamics of transfer from the one school to the other and thus make possible better guidance for students who might consider transferring. In addition, results might facilitate the transfer process for students who do transfer.


METHOD 


In this study the school from which students transferred
is called the feeder school, while the school to which
they transfer is called the host school. Students (N = 130)
(subsequently called "transfers") who transferred from a
feeder school with a curriculum closely resembling that of
the host school were
selected for study. They all transferred during the 
period from fall 1961 through fall 1965. Records 
were studied through spring of 1966. 


A comparison group of students at the host school 
(hereafter called "natives") was selected from the 1961
freshman class to match the transfer students on a
person-for-person basis primarily in respect to academic
potential, and secondarily in respect to quarters of
enrollment in school. Matching in respect to academic
potential was made on the basis of the verbal (SAT-V) and
mathematical (SAT-M) scores of the Scholastic Aptitude
Test, with emphasis being given to exact matching on the
latter score. The performance record of each selected
native student was then divided into "before" and "after"
transfer segments so as to be equivalent formally to the
segments of the record of his transfer counterpart. This
division was accomplished by dividing the record between
two quarters so that the total credit hours for the
"before" transfer segment would closely approximate the
transferred credit for the transfer student counterpart.
Further, the "after" transfer segment was truncated so
that the number of quarters after transfer did not exceed
the number of quarters attended after transfer by the
transfer student counterpart.
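
The matching and record-splitting steps described above can be sketched in code. The following is only an illustrative sketch, not the authors' procedure: the greedy nearest-neighbor rule, the `Student` record, and the function names are assumptions introduced here, and the secondary matching on quarters of enrollment is omitted for brevity.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Student:
    """Hypothetical record; the field names are assumptions for illustration."""
    sid: str
    sat_m: int
    sat_v: int


def match_natives(transfers, natives):
    """Pair each transfer with one unused native student.

    Assumed rule: smallest SAT-M gap first (the text stresses exact
    matching on SAT-M), with the SAT-V gap used to break ties.
    """
    available = list(natives)
    pairs = []
    for t in transfers:
        if not available:
            break
        best = min(
            available,
            key=lambda n: (abs(n.sat_m - t.sat_m), abs(n.sat_v - t.sat_v)),
        )
        available.remove(best)
        pairs.append((t, best))
    return pairs


def split_record(quarter_hours, transferred_hours, max_after_quarters):
    """Divide a native record into "before" and "after" segments.

    The split falls between two quarters so that cumulative credit hours
    in the "before" segment come as close as possible to the counterpart's
    transferred credit; the "after" segment is then truncated to the number
    of quarters the transfer counterpart attended after transfer.
    """
    best_k, best_gap = 0, abs(transferred_hours)
    cumulative = 0.0
    for k, hours in enumerate(quarter_hours, start=1):
        cumulative += hours
        gap = abs(cumulative - transferred_hours)
        if gap < best_gap:
            best_k, best_gap = k, gap
    before = quarter_hours[:best_k]
    after = quarter_hours[best_k:best_k + max_after_quarters]
    return before, after
```

For instance, a native record of quarterly credit loads split against a counterpart's 77.8 transferred hours would place the division after the quarter at which cumulative credit is nearest 77.8.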


Summary statistics for the matching variables 
and for several other variables descriptive of the 
groups are shown in Table 1. The predicted aver- 
age ( PA ) was obtained by use of a linear equation 
for predicting first year averages at the host school 
from SAT scores. It was derived from data for the 
entire entering freshman class. Predicted and earned
averages were computed on the basis of a scale in which
A = 40, B = 30, C = 20, D = 10, and F = 0.


For both groups of students average grades for 
the before transfer segments of their records for 
verbal subjects (English and social science courses), 
for quantitative subjects (chemistry, mathematics, 
and physics), and for all work taken (excluding phys- 
ical education, band, etc. ) were computed. These 
average values and their differences obtained by 
subtracting native from transfer averages are shown 
in the top half of Table 2. 


Also average grades for various segments of work after
transfer were computed for both groups. These included
all courses taken in the first quarter, verbal subjects
taken in the first three quarters, quantitative subjects
taken in the first three quarters, all work taken in the
first three quarters, all work taken in quarters 4 through
6, all work taken in quarters 7 through 9, and the
cumulative average for all work
taken. These averages and their differences are 
shown in the lower portion of Table 2. 


Finally, estimates of the contribution of the four
factors of concern to the comparative before and after
transfer performance for the transfer students were
computed by methods subsequently described.


RESULTS 


Comparability of Groups 


TABLE 1


DESCRIPTIVE DATA FOR MATCHING TRANSFER AND NATIVE SAMPLES


                                       Transfers            Natives
Measures                              N     M     SD      N     M     SD    Difference
SAT-V                                130   425    85     130   442    75      -17
SAT-M                                130   514   [?]     130   519    69       -5
High School Average                  [?]   29.8   6.1    130   30.3   5.5     [?]
Predicted Average                    130   [?]    3.0    130   19.3   3.1     [?]
Hours Credit Transferred             130   77.8  24.4    130   73.3  31.1      4.5
Total Quarters Enrolled              130   [?]   3.47    109   10.2  3.16      1.0
Quarters Enrolled before Transfer    130   [?]   1.58    117    5.2  1.67      0.5
Enrollment by Curriculum (a)
  Engineering                              72                   54             18*
  Industrial Management                    18                   34            -16
  Science and Architecture                 10                   12             -2
Status at End of Study (a)
  Graduated or Still in School             63                  [?]            -13*
  Withdrew or Dropped                      37                   24            [?]

* Significant at 5 percent level of confidence.
a Reported in percents.
[?] Entry illegible in the scanned source.


TABLE 2


SELECTED GRADE POINT AVERAGES FOR TRANSFER AND NATIVE SAMPLES


                                          Transfers           Natives
Measures                                 N     M     SD     N     M     SD    Difference
Before Transfer
  Verbal Average                        [?]   [?]   [?]    [?]   [?]   [?]      [?]
  Quantitative Average                  [?]   [?]   [?]    [?]   [?]   [?]      [?]
  Total Average                         [?]   27.9  [?]    129   19.4  5.9      8.5
After Transfer
  First Quarter Average                 [?]   [?]   [?]    [?]   22.7  [?]      [?]
  Verbal Average (Quarters 1-3)         [?]   20.3  [?]    102   22.4  [?]     -2.1
  Quantitative Average (Quarters 1-3)   [?]   [?]   [?]    [?]   [?]   [?]      [?]
  Total Average (Quarters 1-3)          [?]   [?]   [?]    [?]   22.6  [?]     -5.4
  Total Average (Quarters 4-6)          [?]   [?]   [?]    [?]   24.8  [?]     -2.9
  Total Average (Quarters 7-9)          [?]   [?]   [?]    [?]   [?]   [?]      [?]
  Final Average                         [?]   [?]   [?]    109   23.2  5.5      [?]

* Significant at 5 percent level of confidence.
[?] Entry illegible in the scanned source.


In selecting the matching native students, emphasis was
given to selecting [...] a person-for-person basis [...]
transfer students in respect to academic [...] possible.
Since the differences between groups in [...] SAT-M, high
[...] was no significant difference in respect to hours of
credit transferred. It must be noticed, however, that the
[...] graduated at the time of termination of the study.
However, it was not possible to make a closer match 
on these variables without creating significant differ- 
ences on the measures of academic potential. Also 
it is relevant to note that mean SAT scores for the 
matching sample of native students were 77 points 
lower on SAT-V (442 versus 519) and 73 points lower 
on SAT-M (519 versus 592) than average scores for 
the entire freshman class from which the sample was 
selected. 


Comparative Performance of Transfer and Native Students


While the details will not be cited, the data in
Table 2 show that even when the two groups were
matched in respect to academic potential, the results
follow the pattern typically found in transfer studies.
Transfer students made higher averages than natives
before transfer. After transfer there was a decrement
in the performance level of the transfer group.
Though average grades after transfer for the transfers
showed gradual improvement after the first quarter,
and though differences between averages for natives
and transfers became smaller, averages for the
transfers never equalled or exceeded the averages of
the natives in comparable segments of the period
studied; neither did they reach the level attained before
transfer. In contrast, natives showed no decrement
in performance "after transfer"; and performance
levels increased with successive periods of enrollment.


TABLE 3


SUMMARY OF ESTIMATED CONTRIBUTION OF FOUR VARIABLES TO
DECLINE IN AVERAGE GRADES AFTER TRANSFER


Source and Method of Estimation                                 Amount

Grading Standards
  Before Transfer (Native) -
  Before Transfer (Transfers)                                   -8.5*

Coping Ability
  Conservative: After Transfer, First Three Quarters -
                After Transfer, First Quarter (Transfers)       -1.4*
  Dashing:      After Transfer, Quarters 4 through 6 -
                After Transfer, First Quarter (Transfers)       -6.1*

Preparation plus Coping Ability
  Conservative: Before Transfer (Native) -
                After Transfer, First Quarter (Transfers)       -3.6*
  Dashing:      After Transfer, First Quarter (Native) -
                After Transfer, First Quarter (Transfers)       -6.9*

Preparation
  Conservative: (Preparation and Coping) - Coping               -2.2
  Dashing:      (Preparation and Coping) - Coping               -0.8

Academic Potential
  Average Grades, Before Transfer (Native) -
  Average Grades (Total Host)                                   -3.2 a

Estimated Total (Dashing), Sum of Above                         -18.6
Estimated Total (Conservative), Sum of Above                    -15.3

Actual Total
  After Transfer, First Quarter (Transfers) -
  Before Transfer (Transfers)                                   -12.1

* Significant at 5 percent level of confidence.
a Significance not tested.


Contribution of Four Factors to Grade Decrements 
After Transfer 


Table 2 indicates that the transfer group showed 
a decrement of 12.1 (1.21 letter grades) from the
average before transfer to the first quarter after 
transfer. The following paragraphs are devoted to 
estimating from the data in Table 2 what portion of 
this decrement can be associated with each of four 
factors: grading standards, preparation for advanced 
work, academic potential, and coping ability. The 
methods of estimation and the resulting estimates 
are shown in Table 3. 


Grading Standards. The grading standard refers to
the judgmental scale instructors use in evaluating
(assigning grades to) the performance of their
students. Operationally it may be defined as the average
grade assigned to the work of students whose
academic potential is at a specified level. Since the
academic potential of the transfer and native groups
is the same, an estimate of the decrement attributable
to the difference in grading standards for the
two schools is given by the difference between the
before transfer averages at the two schools. Thus,
a difference of 8.5 points can be attributed to
differences in grading standards, indicating, of course,
that the grading standard at the host school is that
much lower or harder than the grading standard at
the feeder school.


Coping Ability. Coping ability refers to the 
kinds of behaviors required of students for dealing 
adequately with a given environment. If the presence
or absence of such abilities affects academic
performance, grades should decline when one moves
from one environment to another, especially if the
environments are substantially different; and grades
should improve as one learns more adequately to
cope with a given environment. An estimate of the
effect of improved coping ability can be made by
subtracting the average for the first quarter after
transfer for the transfer students from the average
in some subsequent period. Following this procedure,
a conservative estimate of the effect of improved
coping ability on grades for the transfer students can
be obtained by subtracting average grades for the first
quarter after transfer from average grades for the
first three quarters after transfer. A more "dashing"
estimate [...]. [...] attributable to coping effects equals
-1.4 (conservative) and -6.1 (dashing).


[...] estimate of the decrement after transfer attributable
to the combination of inadequate preparation and
inadequate coping ability can be made by subtracting
the first quarter average after [...]. [...] there is a
decrement of 6.9 attributable to the combination of
these two factors. This estimate may be somewhat [...]
a reduction in grades following transfer as a function
of [...] increase in grades following "transfer" [...]
expected for the natives as a function of improved
coping ability. In fact, an increment of 3.3 occurred for
this group. [...] ability and preparation which [...].
[...] conservative [...] may be obtained by [...] quarter
after transfer from the total average before transfer
for the natives. This procedure yields a decrement
of -3.6.


With these estimates of combined effects available,
estimates of the decrement associated only with
preparation effects can be obtained by subtracting the
estimate for coping ability effects from these values.
This procedure yields dashing and conservative
estimates of decrement for the first quarter after
transfer of 0.8 and 2.2 respectively.


Academic Potential. Academic potential refers
to the intellectual abilities and aptitudes usually found
associated with college grades. Since the two groups
used in the study were matched in respect to academic
potential, differences in grades between groups and
differences in grades before and after transfer for the
respective groups, superficially at least, would not
appear to be associated with differences in academic
potential.


However, it is relevant to recall that, except
in rare instances, grading standards are usually
formulated in such a way as to keep the distribution of
grades about the same even though the [...] host school
for example, while mean SAT-V and [...] and 88 points
respectively from 195[?] through 1965, [...] earned first
quarter grades increased only .03 of a letter grade.
[...] attributable to these different [...] can be
obtained by subtracting the average grade before
transfer for the native students from the average [...]
but an approximate estimate can be made by using
average first quarter grades for the freshman year.
For the [...]


Combining the several estimates for the four
variables, the conservative estimated total decrement
equals 15.3 and the "dashing" estimated total
decrement equals 18.6. These total estimates
are respectively 3.2 and 6.5 higher [...] decrement
of 12.1; the [...]


SUMMARY AND DISCUSSION


The foregoing presentation has in essence described
and illustrated the use of a proposed design
for estimating the contribution of four factors which
are [...] to the difference in academic performance
as measured by average grades for students who
transfer from one school to another. While the data
reported here dealt with [...] in performance following
transfer, [...] estimating procedures [...] believed to
be [...] validity to the analysis of [...] increments in
achievement following transfer as well as decrements.
It may well be, however, that the estimating
procedures are [...] sufficiently precise [...]
decrement [...]. At any rate, the design permits the
possibility that the pattern of changes associated with
the four factors may be strikingly different among
various school pairs.


As for the actual estimates [...] must be [...] the
fact that [...] obtained by both the conservative and
dashing estimating procedures exceeded the actual
decrement. [...] overestimate may arise from the fact
that the estimating procedures are in themselves
faulty and/or [...] identify and correctly
assess the interactions of the four variables considered.
For example, it will be noticed that the conservative
sum of decrements for the four factors exceeds
the observed total decrements by 3.2, an amount
which equals precisely the estimated decrement for 
academic potential. This observation is suggestive 
of the possibility that this measure is subsumed in 
the estimated decrement attributed to differences in 
grading standards. 
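
The arithmetic behind the conservative and "dashing" totals can be retraced as a small sketch. The values below are the magnitudes of the amounts listed in Table 3; the variable and function names are assumptions introduced only for illustration.

```python
# Magnitudes of the estimated decrements from Table 3, in grade points
# on the scale where one letter grade equals 10 points.
GRADING_STANDARDS = 8.5
ACADEMIC_POTENTIAL = 3.2
COPING = {"conservative": 1.4, "dashing": 6.1}
PREP_PLUS_COPING = {"conservative": 3.6, "dashing": 6.9}
ACTUAL_DECREMENT = 12.1


def summarize(kind):
    """Return the preparation estimate, summed total, and excess over actual."""
    # Preparation alone is the combined estimate minus the coping estimate.
    preparation = PREP_PLUS_COPING[kind] - COPING[kind]
    total = GRADING_STANDARDS + COPING[kind] + preparation + ACADEMIC_POTENTIAL
    return {
        "preparation": round(preparation, 1),
        "total": round(total, 1),
        "excess": round(total - ACTUAL_DECREMENT, 1),
        "actual_in_letter_grades": round(ACTUAL_DECREMENT / 10, 2),
    }


for kind in ("conservative", "dashing"):
    print(kind, summarize(kind))
```

Under these inputs the conservative excess equals the academic potential estimate, which is the coincidence the paragraph above draws attention to.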


Another point of potential error that may be iden- 
tified is that of estimating the combined effects of 
preparation plus coping ability. In addition to the two
estimates already noted, estimates based on later
segments of the performance record can be made.
For example, estimates for the first three quarters,
for quarters 4 through 6, and for quarters 7 through
9 are -5.4, -2.9, and -2.8 respectively. The observed
reduction in difference for the successive time
periods is consistent with the expectation that as the 
transfers experience more and more of the environ- 
ment and of the same level of instruction as that ex- 
perienced by the natives, differences attributable to 
these factors would tend to disappear. 


The fact that the observed decline in decrement 
may not be entirely a function of improved prepara- 
tion and coping ability for the transfer students is sug- 
gested by the increasingly smaller numbers of stu- 
dents as a function of the elimination of poorly 
achieving students in the successively later periods 
of enrollment (Table 2). 


It is possible to estimate the effect of this phe- 
nomenon on some of the differences attributable to 
the combined effects of preparation and coping ability 
already noted by taking into account for the compar- 
ison periods, the average grades made in each of 
these periods by the forty-nine transfer students who 
persisted through seven to nine quarters. According 
to the results shown in Table 4, it would appear that 
approximately half the difference for each period con- 
sidered can be attributed to this phenomenon. 
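
The "approximately half" statement can be checked against the Table 4 entries with a short sketch. The dictionary layout and function name are assumptions for illustration; the numbers are those reported in Table 4.

```python
# Table 4 entries: estimated contribution (grade points) of dropout of
# poorly achieving students and of preparation and coping effects.
TABLE_4 = {
    "first quarter": {"dropout": 3.2, "prep_coping": 3.7},
    "first three quarters": {"dropout": 3.2, "prep_coping": 2.2},
    "quarters 4-6": {"dropout": 1.5, "prep_coping": 1.4},
}


def dropout_share(row):
    """Fraction of the total difference attributed to dropout."""
    return row["dropout"] / (row["dropout"] + row["prep_coping"])


for period, row in TABLE_4.items():
    total = row["dropout"] + row["prep_coping"]
    print(f"{period}: total {total:.1f}, dropout share {dropout_share(row):.2f}")
```

The shares fall between roughly 0.46 and 0.59, consistent with the statement that about half of each difference is associated with the loss of poorly achieving students.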


TABLE 4


ESTIMATED CONTRIBUTIONS OF DROPOUT STUDENTS AND OF
PREPARATION AND COPING ABILITY TO DIFFERENCES BETWEEN
SELECTED AVERAGES FOR TRANSFER AND NATIVE STUDENTS


                                          Time Period
                                  First     First Three   Quarters
Source of Difference              Quarter   Quarters      4-6
Dropout of Poorly
  Achieving Students               3.2        3.2           1.5
Preparation and
  Coping Effects                   3.7        2.2           1.4
Total                              6.9        5.4           2.9


Other complexities and complications which have
an effect on the accuracy and interpretation of the
several estimates can also be noted, but not easily
assessed. For example, the validity of the
interpretations given the estimates is predicated upon the
assumption that the effects of the several factors 
considered remained more or less constant (i.e.,
operated with more or less the same strength)
throughout the period studied. There is evidence to
suggest such an assumption is not altogether valid.
For example, there are some reasons to believe that
standards at the host institution may have been easier
in the last 2 than in the first 2 years, and at the
same time over the period studied the general level 
of the standard may have gone up or become harder. 
While it is difficult to assess precisely the effects 
of these changes on the reported results, neverthe- 
less it is clear that a difference in standard between 
the first 2 and last 2 years at the host school could 
partially account for the relatively large difference 
between the before transfer and the first quarter af- 
ter transfer average for the native group. 


Similar difficulty results from the fact that in
the period covered by the study, students in both 
groups dropped out, but not in pairs. For example, 
native students withdrew or were dropped in greater 
numbers before transfer, while transfer students 
withdrew or were dropped in greater numbers after 
transfer. These trends would tend to increase the 
difference between the before transfer and first quarter
after transfer average for the natives and
increase the differences between averages in the after
transfer segment for transfers. 


A final complicating factor to be noted relates 
to divisional and departmental differences as they 
affect earned GPA's. From Table 2, for example,
it is evident that the grading standard at the host 
institution is harder than that of the feeder institu- 
tion by half a letter grade and one and a half letter 
grades for verbal and quantitative subjects respec- 
tively. This kind of difference has been noted by 
other investigators (15). These complexities serve 
only to remind one of the difficulty of finding in the 
educational setting data that may be clearly and
unequivocally interpreted. For the present study at
least, the estimated values are of sufficient magni- 
tude to demonstrate an effect of the factors consid- 
ered on the comparative performance of the native 
and transfer students. 


The possible ambiguous interpretation of some 
of the data, however, suggests the need for further 
efforts directed toward the development of more 
precise analytical designs that can increase 
precision of measurement and identify possible
interactive effects.


However, the present estimates, even though 
approximate, seem potentially useful in understand- 
ing more adequately the contribution of factors un- 
derlying the expected changes in student performance 
following transfer from one school to another. Were 
such information available, for instance, to advisers
of a given feeder school relative to the several
host schools to which its students transfer, they
would be able to counsel students considering
transfer in a far more effective way than is now
possible. Also, on the basis of such information, school
administrators should be able to work out more
effective procedures for minimizing difficulties in the
transfer process.
FOOTNOTE


Portions of this paper were presented as part of
a program on "The Transfer Student's Academic
Success" at the Annual Meeting of the
Association for Measurement and Evaluation
in Guidance, Dallas, Texas, March 22, 1967.
EFFECTIVENESS OF INSTRUMENTAL AND TRADITIONAL
METHODS OF COLLEGE READING INSTRUCTION


RICHARD P. WHITEHILL and SUE J. RUBIN
ABSTRACT


Introverted and extraverted Ss were assigned to traditional and instrumental reading groups. The
performance of Ss assigned to the instrumental treatment condition was superior to those in the traditional
condition. No significant main effect was found for the introversion-extraversion dimension, although the
performance of instrumentally trained extroverts was superior to that of other groups.


WHITEHILL AND JIPSON (4) demonstrated the
role of the introversion-extraversion (I-E) variable
in college developmental reading program performance.
Using instrumental and traditional reading
program formats, Ss were sorted on the basis of I-E
scores and their performance increments were compared.
There were two prime findings. First, the
instrumental program appeared on the whole more
effective than the traditional program as measured
by gain scores accrued across all groups of Ss. Second,
extraverts (E) were affected to a greater extent
than introverts (I) by the program variable.
That is, there was little overall difference within the 
introvert group between traditional and instrumental 
programs, while there were significant differences 
within the extravert group on the program variable, 
with the extraverts accruing better gains in the in- 
strumentally based program. The purpose of the 
present paper is to replicate the Whitehill-Jipson 
study with particular attention to the overall pro- 
gram variable. 


METHOD 


Subjects 

The Ss for this study were forty graduate and
undergraduate students at The University of Wisconsin
during the 1969 spring semester. Subject selection
took place after administration of the Eysenck
Personality Inventory (EPI) during the first session
of a free, voluntary developmental reading course.
The scores on the EPI were divided into thirds; Ss
ranking in the upper 33 percent of the cases composed
the E groups; those in the lower 33 percent the
I groups, and those in the middle 33 percent the M
or [word illegible in source] groups. The EPI was given
to all students who registered for the course. A test of
current reading ability, the Cooperative English Test,
Form A, was also administered to all students entering
the program. Ss were matched according to sex,
age, year in school, college curriculum within the
university, scores on the EPI, and comprehension
percentile score on the Cooperative English Test.
They were then randomly assigned to an experimental
or control condition; six E's, seven I's, and seven
M's were taught by traditional methods; six E's,
seven I's, and seven M's were taught by experimental
methods.


Apparatus 


The experimental group used automatic reinforc- 
ing clocks which were equipped with a light that went 
out when a given criterion speed on a 500-word pas- 
sage was not met. 
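
The clock rule can be illustrated with a short sketch. The 500-word passage length and the 50 WPM increase after each success come from the text (the latter from the Procedure section); the function names and the exact comparison rule are assumptions made only for illustration.

```python
PASSAGE_WORDS = 500   # passage length stated in the text
STEP_WPM = 50         # criterion increase after each success (Procedure)


def time_limit_seconds(criterion_wpm):
    """Seconds allowed for one passage before the light goes out."""
    return PASSAGE_WORDS / criterion_wpm * 60.0


def run_session(reading_times_s, start_wpm):
    """Apply the criterion rule to a series of passage reading times.

    Returns the final criterion and a log of (criterion, met) pairs.
    A passage counts as met when it is finished within the time limit.
    """
    criterion = start_wpm
    log = []
    for seconds in reading_times_s:
        met = seconds <= time_limit_seconds(criterion)
        log.append((criterion, met))
        if met:
            criterion += STEP_WPM  # raise the criterion only after a success
    return criterion, log
```

A reader starting at a hypothetical criterion of 500 WPM would thus have 60 seconds for the first passage, and a shorter limit after each success.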


Traditional Ss alternated between the use of a Science
Research Associates (SRA) Reading Accelerator
and timed readings on stop watches. The accelerator
is a device with a bar which moves down the
page, covering material at a preselected rate. The
student is forced to increase his speed to read the 
material before it disappears from sight. 


The reading material for the experimental Ss con- 
sisted of paperback books divided into 500-word pas- 
sages. The experimental Ss were in the traditional 
class for their initial three lessons, after which they
were transferred to the operant method and read Hiroshima
by John Hersey. Upon completion of that book, they
read The Bridge at Andau by James Michener. Thereafter,
Ss were free to choose any paperback they
wished at a level of difficulty equivalent to these books.
Traditional group Ss used the texts Increasing Reading
Efficiency (3) and Maintaining Reading Efficiency (4).
These workbooks contain reading selections on various
topics. Each selection is followed by ten short answer
or objective comprehension questions. The difficulty
level of the experimental and traditional material was
judged to be equivalent.


Procedure

All students read selection No. 1 in Efficient
Reading (1) during their orientation session; this
measure provided a common pretest of basal reading
speed and comprehension. The traditional group
attended two 50-minute classes per week for as many
weeks as desired. They used the traditional texts,
completing twenty selections in Increasing Reading
Efficiency and a varied number in Maintaining Reading
Efficiency until a speed of 1,000 words per minute
(WPM), timed on a stop watch, was easily attained.
[...] for each correct answer. A minimal score of
60 percent was considered acceptable in terms of
comprehension.

[...] as many 500-word passages as time permitted. He
[...] with a chart of time clock settings giving [...]
rates. If S's reading speed for a particular passage
was slower than the [...], the light went out at the end
of the time period before S had completed the passage.
If S did not complete the [...], he was instructed to
read subsequent passages [...] same rate until criterion
speed was met. Each time S successfully read at
criterion speed, he stopped the clock, recorded his
success, and raised the criterion 50 WPM.

[...] four 500-word [...] accompanied every [...].
Subjects were instructed to answer each question upon
completion of the corresponding textual sections. The
questions were parallel to the traditional group
questions in that they dealt with factual recall,
inference, and generalization from the reading. The
experimenter graded [...]


RESULTS

All Ss completed at least six sessions. Individual
records were kept of each S's beginning WPM rate
and final WPM rate on a session by session basis.

Analysis of pretreatment WPM scores showed no
significant differences between any of the S groups.
Analysis of personal data with regard to age and year
in school of Ss also did not yield significant results.

Table 1 shows the mean percentage gain in WPM
scores between the first and sixth sessions for each
group and treatment. As a whole, the experimental
treatment groups made more progress, attaining a
233 percent mean gain in WPM as compared to the
163 percent mean gain for the traditional treatment.


TABLE 1

MEAN PERCENTAGE GAIN IN WPM BETWEEN
FIRST AND SIXTH SESSIONS

Treatment        E        M        I        Mean
Traditional      159.2    192.6    137.7    163.3
Experimental     272.2    210.7    221.1    232.8
Mean             215.7    201.6    179.4


An examination of the performance of I's, E's, and
middle Ss in each treatment reveals a similar trend.
Each group in the experimental treatment [...]
traditional treatment. [...] experimental) and E-t
(extravert-traditional), [an] 83.4 percent difference
between I-ex (introvert-experimental) and I-t
(introvert-traditional), and an 18.1 percent difference
between M-ex (middle-experimental) and M-t
(middle-traditional). [...] mean [...] (201.6%), and
[...] least (179.4%).

A two-way analysis of [...] was computed using these
data. The Scheffé approximation was used [...] each
group. Table [...]. There is a significant difference
between experimental and traditional treatments as far
as mean percentage WPM gain is concerned, significant
at the .05 level. Also, the [...] difference [...] in
the [...], I-ex, and M-ex groups did [...] different
[...] relationship [...] for the E- [...].


TABLE 2

ANALYSIS OF VARIANCE

Source           SS       df      MS       F
Program          5.09     1       5.09     4.02*
[?]              8.86     2       4.43     0.35
Interaction      1.56     [?]     [?]      [?]
Within Groups    [?]      34      [?]
Totals           19.81    39

* Significance level of .05.
[?] Entry illegible in the scanned source.


As the experiment was structured to hold comprehension
quotients constant, no analysis of comprehension
scores is presented.
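
The treatment differences quoted in the Results can be recomputed from the cell means of Table 1, as in the sketch below. The data layout and function names are assumptions for illustration; group means are taken as simple averages of the two cells, which is appropriate here only because each group has the same number of Ss in both treatments.

```python
# Mean percentage gain in WPM from Table 1 (E, M, I groups by treatment).
GAINS = {
    "traditional": {"E": 159.2, "M": 192.6, "I": 137.7},
    "experimental": {"E": 272.2, "M": 210.7, "I": 221.1},
}


def treatment_differences(gains):
    """Experimental minus traditional gain for each group, to one decimal."""
    return {
        group: round(gains["experimental"][group] - gains["traditional"][group], 1)
        for group in gains["traditional"]
    }


def group_means(gains):
    """Average gain for each group across the two treatments."""
    return {
        group: (gains["experimental"][group] + gains["traditional"][group]) / 2
        for group in gains["traditional"]
    }


print(treatment_differences(GAINS))
print(group_means(GAINS))
```

The I and M differences reproduce the 83.4 and 18.1 percent figures cited in the text, and the I group mean reproduces the 179.4 percent figure.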


DISCUSSION 


The results of this study confirm the Whitehill and 
Jipson findings of the overall superiority of the op- 
erant or instrumental versus the traditional strategy 
of reading instruction. That is, the instrumental
method produced greater proportional WPM gains
across E, I, and M groups than did the traditional
methodology. These results do not entirely replicate
the Whitehill and Jipson finding that E's do
significantly better in the instrumental program than do
I's or M's. Although E's did show more gain in the
instrumental group than any other group in any other
treatment, the overall difference is not great enough
to reach a significant alpha level of the program effect.


The next phase of this research program will in- 
volve assessment of the instrumental versus the 
traditional program over a large number of Ss using 
a correlational design rather than experimental
treatment. If reasonable cross validation is found, there
would seem to be good reason to adopt the instrumen- 
tal approach to speed reading instruction in general. 
TWO GENERALIZATIONS OF THE ITEM DISCRIMINATION
INDEX TO MULTI-SCORE ITEMS


DOUGLAS R. WHITNEY and DARRELL L. SABERS
University of Iowa          University of Arizona


One based on 
limitations of 


em than the << 
ion on which discrim- 
ed is not available for ац individuals 
aminer, For example, total test 
i for a theoretical con- 
em discrimination 
Serve as q Partial substi- 


Struct measur, igh Positive it 
em wil 


In order to do so, it is necessary to first define item "difficulty" in the multi-score case. A general definition of difficulty is how "hard" the item is. That is, how does the performance on an item by a group of examinees compare with the highest possible level of performance? The performance of the group can be expressed as the difference between the mean item score and the minimum possible score on the item. The highest possible level of performance, expressed in a similar manner, would be the difference between maximum and minimum possible item scores. The generally accepted convention of expressing difficulty as a percent will be used for this index. Thus, a general index of item difficulty is


$$P = \frac{\bar{X} - X_{\min}}{X_{\max} - X_{\min}} \times 100 \qquad (1)$$

where $P$ is the item difficulty in percentage units, $\bar{X}$ is the mean item score of n examinees, $X_{\max}$ is the highest obtainable item score, and $X_{\min}$ is the minimum obtainable item score. $P$ represents the percent of maximum performance achieved by the group. The range of $P$ is from 0 percent (when all Ss earned the minimum item score) to 100 percent (when all Ss earned the maximum item score). For items with a minimum score of zero, the index simplifies to $[\bar{X}/X_{\max}] \times 100$, and becomes the percent of examinees answering the item correctly for a dichotomous item. When the traditional correction for guessing formula is applied to a multiple choice item, $X_{\min} = -1/(r-1)$, where r is the number of item responses.


Suppose a multi-score item has been completed by twelve students, and that the scoring provided for integer scores of 0 to 3 inclusive. The following data might have resulted:

Item Score    Number Earning    Total Points by
   (X)          Score (f)       f Students (fX)

    3               3                  9
    2               2                  4
    1               2                  2
    0               5                  0

                 N = 12            ΣfX = 15

The average item score ($\bar{X}$) is then $\sum fX / N = 1.25$. For this item $X_{\max}$ is 3 and $X_{\min}$ is 0, so that $P = [(1.25 - 0)/(3 - 0)] \times 100$, or about 42 percent. That is, this group earned 42 percent of the available points on this item.
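
The difficulty index of formula (1) can be checked against this worked example with a short Python sketch. The function name `item_difficulty` and the way the twelve scores are listed are illustrative assumptions, not part of the original article.

```python
def item_difficulty(scores, x_min, x_max):
    """Percent of maximum performance achieved by the group, as in formula (1)."""
    mean_score = sum(scores) / len(scores)
    return (mean_score - x_min) / (x_max - x_min) * 100


# Twelve students: three scored 3, two scored 2, two scored 1, five scored 0.
scores = [3] * 3 + [2] * 2 + [1] * 2 + [0] * 5
p = item_difficulty(scores, x_min=0, x_max=3)
print(f"P = {p:.1f} percent")
```

For an item scored 0 or 1 the same function returns the percent of examinees answering correctly, which is the dichotomous special case described in the text.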


The interpretation of item discrimination as differential difficulty leads to the definition of item discrimination ($D_1$) as the difference between item difficulties (as defined above) for two examinee groups. Since many previous discrimination indices have been expressed as decimal fractions ranging from -1 to +1, that convention will be used here. This index representing differential difficulty has the form:

$$D_1 = P_U - P_L = \frac{\bar{X}_U - \bar{X}_L}{X_{\max} - X_{\min}} \qquad (2)$$

where $P_U$ and $P_L$ are the difficulties (as in formula (1)) for the upper and lower groups expressed as proportions, $\bar{X}_U$ and $\bar{X}_L$ are the mean item scores for the upper and lower groups, and $X_{\max}$ and $X_{\min}$ are as previously defined.


Illustration of the Computation of $D_1$

Suppose that a multi-score item with possible scores 0 to 3 inclusive had been completed by twelve examinees in each of two criterion groups. The following data might have resulted:

              Number in       Total       Number in       Total
Item         Upper Group      Points     Lower Group      Points
Score       Earning Score     Earned    Earning Score     Earned
 (X)           (f_U)         (f_U X)       (f_L)         (f_L X)

  3              5              15            1              3
  2              4               8            1              2
  1              2               2            4              4
  0              1               0            6              0

             n_U = 12      Σf_U X = 25    n_L = 12      Σf_L X = 9

For this example, $\bar{X}_U = 25/12 = 2.083$, $\bar{X}_L = 9/12 = 0.75$, $X_{\max} = 3$ and $X_{\min} = 0$, so that $D_1 = (2.083 - 0.750)/3$ or about .44. That is, the upper group earned about 44 percent more of the available points than did the lower group.
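
As a hedged illustration, the computation of $D_1$ for these two criterion groups can be written in a few lines of Python. The dictionaries mapping each item score to its frequency and the helper names `mean_score` and `d1` are assumptions made for this sketch.

```python
def mean_score(freq_by_score):
    """Mean item score from a {score: frequency} mapping."""
    n = sum(freq_by_score.values())
    return sum(score * f for score, f in freq_by_score.items()) / n


def d1(upper, lower, x_min, x_max):
    """Differential difficulty, formula (2): difference of group means over the score range."""
    return (mean_score(upper) - mean_score(lower)) / (x_max - x_min)


upper = {3: 5, 2: 4, 1: 2, 0: 1}   # frequencies in the upper criterion group
lower = {3: 1, 2: 1, 1: 4, 0: 6}   # frequencies in the lower criterion group
d1_value = d1(upper, lower, x_min=0, x_max=3)
print(f"D1 = {d1_value:.2f}")
```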


Discrimination as Item-Criterion Association

An alternative definition for discrimination is the degree to which item performance is consistent with criterion performance. That is, the extent to which individuals who differ on the criterion measure differ in a similar manner on their item response. In most testing situations, item performance is expected to be related positively to criterion measures (e.g., total test score in the usual case). The degree of consistency of these performances may be illustrated via some suitable correlation-type index. An alternate operational definition of this property is the net proportion of all possible subject pairs which show a positive relationship between item score and criterion measure.
This approach was introduced by Findley (2) as an 


the two approaches yield equivalent indices for dichotomous items, generalization of Findley's approach leads to an index different from $D_1$ for multi-score items.


For any number (N) of examinees, the number of possible unique pairs of Ss is $N(N-1)/2$. If these Ss are grouped into c criterion groups (of sizes $n_1, n_2, \ldots, n_c$, where $\sum_{j=1}^{c} n_j = N$), those individuals within the same criterion group can no longer be differentiated. That is, since their criterion measures are the same, there can be no basis for discriminating among Ss within the same criterion groups. Hence, these pairings cannot evidence either positive or negative association with item response. Subtracting these pairings, $\sum_{j=1}^{c} n_j(n_j - 1)/2$, from the total number of possible pairs gives the maximum number of pairs which could show positive item-criterion association. It can be shown that

$$\frac{N(N-1)}{2} - \sum_{j=1}^{c} \frac{n_j(n_j - 1)}{2}$$

is algebraically equivalent to both

$$\frac{1}{2}\left(N^2 - \sum_{j=1}^{c} n_j^2\right) \quad \text{and} \quad \sum_{i<j} n_i n_j .$$

The latter formula is usually preferable for computational purposes. This quantity, the maximum number of positively discriminating pairs for fixed criterion group sizes, will be denoted by $D_{\max}$.
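
The algebraic equivalence of the three expressions for $D_{\max}$ can be checked numerically. The sketch below is an illustration only; the function names and the unequal group sizes (12, 7, 5) are assumptions chosen for the check, not values from the article.

```python
from itertools import combinations


def d_max_by_subtraction(sizes):
    """All pairs minus the pairs that fall within a single criterion group."""
    n_total = sum(sizes)
    within = sum(n * (n - 1) // 2 for n in sizes)
    return n_total * (n_total - 1) // 2 - within


def d_max_by_squares(sizes):
    """One half of (N squared minus the sum of squared group sizes)."""
    n_total = sum(sizes)
    return (n_total ** 2 - sum(n * n for n in sizes)) // 2


def d_max_by_products(sizes):
    """Sum of products of group sizes over all pairs of distinct groups."""
    return sum(a * b for a, b in combinations(sizes, 2))


for sizes in ([10, 10, 10], [12, 7, 5]):
    print(sizes, d_max_by_subtraction(sizes), d_max_by_squares(sizes), d_max_by_products(sizes))
```

The last form needs only the group sizes and no subtraction, which is presumably why the text calls it preferable for hand computation.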


The number of net positive discriminations may be found by considering each of the pairings involving one S from one criterion group and one S from another group. Each pair in which the S from a


into matrix form as in the followi 
diate sums are calculated, 


$$D_2 = \frac{D_+ - D_-}{D_{\max}} \qquad (3)$$

where $D_+$ represents the number of positive associations, $D_-$ represents the number of negative associations, and $D_{\max}$ represents the maximum number of positive associations.


If item data are put into matrix form, with the ordered criterion groups as columns (column 1 is highest and column c is lowest) and ordered item scores as rows (row 1 is highest item score and row r is lowest), then $D_+$ is found by multiplying each cell frequency by the sum of the frequencies in all cells that lie both below it and to its right, and summing these products; $D_-$ is found in the same way from the cells that lie below and to the left.

For dichotomous items, $D_+$ is the product of the number of examinees in the upper group who got the item correct and the number of examinees in the lower group who got the item incorrect. Similarly, $D_-$ is the product of the number of upper group incorrect responses and lower group correct responses. $D_{\max}$ is simply $n^2$, where n (as above) represents the number of examinees in each group. Only for items scored 0 or 1 and using 27 percent criterion groups is this index equivalent to Johnson's U-L index.

Illustration of the Calculation of $D_2$


Again, the data below might have occurred for an item with scores from 0 to 3 inclusive. In this example, there are three ordered criterion groups, each containing ten Ss (although equal group sizes are used in all examples, neither index requires equal n's).

Item       Frequency    Frequency    Frequency
Score        Upper        Middle       Lower

  3            5            3            0
  2            3            3            2
  1            2            3            4
  0            0            1            4

Total         10           10           10

Here,

$$D_+ = 5(3+3+1+2+4+4) + 3(3+1+4+4) + 2(1+4) + 3(2+4+4) + 3(4+4) + 3(4) = 197,$$

$$D_- = 0(3+3+1+3+2+0) + 2(3+1+2+0) + 4(1+0) + 3(3+2+0) + 3(2+0) + 3(0) = 37,$$

and $D_{\max} = c(c-1)n^2/2 = 3(2)(100)/2 = 300$ since the criterion groups each contain ten examinees. $D_2 = (197 - 37)/300 = .533$ in this example. That is, 53.3 percent more pairs showed positive item-criterion association than showed negative association. Specifically 65.7 percent were positive and 12.3 percent were negative (the rest being neutral or non-discriminating).
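
The pair counting in this example can be reproduced with a brief Python sketch. The layout of `table` (rows are item scores from highest to lowest, columns are criterion groups from highest to lowest) follows the matrix form used in the article; the function names are illustrative assumptions.

```python
def pair_counts(table):
    """Return (D_plus, D_minus) for a frequency table.

    table[i][j] is the number of Ss with the i-th highest item score
    in the j-th highest criterion group.
    """
    rows, cols = len(table), len(table[0])
    d_plus = d_minus = 0
    for i in range(rows):
        for j in range(cols):
            for k in range(i + 1, rows):                      # partners with a lower item score
                d_plus += table[i][j] * sum(table[k][j + 1:])  # ...in a lower criterion group
                d_minus += table[i][j] * sum(table[k][:j])     # ...in a higher criterion group
    return d_plus, d_minus


def d_max(table):
    """Maximum number of discriminating pairs for the observed group sizes."""
    sizes = [sum(col) for col in zip(*table)]
    return (sum(sizes) ** 2 - sum(n * n for n in sizes)) // 2


table = [
    [5, 3, 0],   # item score 3
    [3, 3, 2],   # item score 2
    [2, 3, 4],   # item score 1
    [0, 1, 4],   # item score 0
]
d_plus, d_minus = pair_counts(table)
d2_value = (d_plus - d_minus) / d_max(table)
print(d_plus, d_minus, d_max(table), round(d2_value, 3))
```

Pairs with tied item scores or tied criterion groups contribute to neither count, which corresponds to the "neutral or non-discriminating" remainder mentioned in the text.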


DISCUSSION 


The two indices, when applied to a dichotomous item and two criterion groups, are equivalent indices. In fact, the discrimination indices $D_1$ and $D_2$ are identical for this type of item. However, this equivalence does not hold in general for other items, and most of the discussion in this section concerns the differences between the two indices.

Discrimination as Differential Difficulty ($D_1$)


The major advantay 
to be the Concept 


Se of this approach appears 
index D,, 8 


ual and 
th 


have access to or 
other mechanical aids, a desk calculator 


me groups, may һауе сег” 

it S eheralization of results (see 

However, it may be desirable to have a discrimination index to use with more than two criterion groups. It would be possible to use some weighting of criterion group mean scores and thus obtain an index of differential difficulty similar to $D_1$, but much of the conceptual simplicity would be lost and the subjectivity of weighting would have been introduced. A second limitation, the fact that $D_1$ is score-related, is discussed below.


Discrimination as Item-Criterion Association ($D_2$)

The major advantages of index $D_2$ as a measure of discrimination are almost exactly the weak points of $D_1$ and vice versa. First, $D_2$ is independent of the score values assigned to item responses. Specifically, the scores attached to the levels of item achievement have no effect on the value of $D_2$. This characteristic is appealing in that $D_2$ is dependent on the ability of the item to distinguish between achievement levels and is not affected by the scores attached to these levels.
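
The contrast between a score-related index and an order-based one can be seen by rescoring an item. In the sketch below, the two-group data are those of the earlier $D_1$ illustration; the recoding of the top level from 3 to 10 is a hypothetical choice made only for this demonstration, and the helper names are assumptions. The sketch also assumes that the lowest and highest possible scores both appear among the score levels listed.

```python
def d1(upper, lower):
    """Differential difficulty from {score: frequency} maps for two groups."""
    def mean(freq):
        return sum(s * f for s, f in freq.items()) / sum(freq.values())
    levels = list(upper) + list(lower)
    return (mean(upper) - mean(lower)) / (max(levels) - min(levels))


def d2(upper, lower):
    """Net proportion of positively discriminating pairs; uses only the order of the scores."""
    levels = sorted(set(upper) | set(lower), reverse=True)
    d_plus = d_minus = 0
    for i, high in enumerate(levels):
        for low in levels[i + 1:]:
            d_plus += upper.get(high, 0) * lower.get(low, 0)
            d_minus += lower.get(high, 0) * upper.get(low, 0)
    return (d_plus - d_minus) / (sum(upper.values()) * sum(lower.values()))


upper = {3: 5, 2: 4, 1: 2, 0: 1}
lower = {3: 1, 2: 1, 1: 4, 0: 6}

recode = {0: 0, 1: 1, 2: 2, 3: 10}   # hypothetical rescoring that keeps the order of the levels
upper_r = {recode[s]: f for s, f in upper.items()}
lower_r = {recode[s]: f for s, f in lower.items()}

print("D1:", round(d1(upper, lower), 3), "->", round(d1(upper_r, lower_r), 3))
print("D2:", round(d2(upper, lower), 3), "->", round(d2(upper_r, lower_r), 3))
```

Under this rescoring $D_1$ moves from about .44 to about .37, while $D_2$ is unchanged because the ordering of the achievement levels is the same.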


A second advantage of $D_2$ is that it may be easily applied to situations which are not generally interpreted as test "items." An example of this use might be for aptitude test scores (where performance on the test is considered as the "item") used to distinguish between levels of success on a job. Another example might be in analyzing the degree of agreement between two raters or judges.

A third advantage of $D_2$ is that it is closely related to Kendall's (6) rank correlation tau. $D_2$ is, in fact, the product of tau and a function of the obtained response and criterion frequencies. Although not identical to Kendall's coefficient, $D_2$ is related and hence also related to Cureton's (1) and Glass' (3) rank-biserial correlations. Further, an approximate (asymptotic normal) significance test and a randomization test based on Kendall's S (here $S = D_+ - D_-$) are available.


One limitation to the use of $D_2$ is that the use of more criterion groups than there are item response categories necessarily limits $D_2$ to some value less than unity. In the following example, the prescribed procedure for obtaining $D_{\max}$ indicates 540 possible positive pairs. Since a positively (or negatively) discriminating pair requires differing criterion and response categories, however, the maximum number of pairs with differing responses would be (18)(18) = 324. The limit on $D_2$ for this item would be $|D_2| = 324/540 = .60$.

Item       Frequency                             Frequency
Score     Upper Sixth                           Lower Sixth    Total

  1            6        6      6      0      0       0           18
  0            0        0      0      6      6       6           18

Total          6        6      6      6      6       6

In general, for an even number of criterion groups of equal size, $|D_2| \le c(r-1)/[r(c-1)]$ for $r < c$. The limiting value for an odd number of criterion groups is somewhat less than this value because of the necessity of splitting the middle criterion group frequency in order to achieve the maximum number of discriminating pairs allowed by r response groups. When $r < c$, the use of c criterion groups is, in effect, asking the item to make finer discriminations than its response categories will allow. Although $D_2$ may still be used in these cases, the values will generally be low when compared to cases for which $c \le r$.

Generally, the greater the number of criterion groups employed, the finer the discriminations examined in $D_2$. That is, the inclusion of the intermediate group in the example made a lower value of $D_2$ more likely (as compared to $D_2 = .78$ if only the upper and lower thirds had been used) because it introduced a group with a larger probability of misclassification. For this reason, obtained values of $D_2$ are dependent somewhat on the number of criterion groups employed and should only be compared with $D_2$ indices for items using similarly defined criterion groups. It has been shown (5) that using extreme groups minimizes the possibility of criterion misclassification relative to group size.
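
Both numerical claims in this discussion, the ceiling of .60 for a two-category item with six criterion groups and the value .78 when only the upper and lower thirds of the earlier example are used, can be checked with the following sketch. The function names `d2` and `upper_limit` are assumptions for illustration; `upper_limit` simply evaluates the bound stated in the text.

```python
def d2(table):
    """D2 for a frequency table: rows are item scores (highest first),
    columns are criterion groups (highest first)."""
    rows, cols = len(table), len(table[0])
    d_plus = d_minus = 0
    for i in range(rows):
        for j in range(cols):
            for k in range(i + 1, rows):
                d_plus += table[i][j] * sum(table[k][j + 1:])
                d_minus += table[i][j] * sum(table[k][:j])
    sizes = [sum(col) for col in zip(*table)]
    d_max = (sum(sizes) ** 2 - sum(n * n for n in sizes)) // 2
    return (d_plus - d_minus) / d_max


def upper_limit(c, r):
    """Bound on |D2| quoted in the text for an even number c of equal-sized groups, r < c."""
    return c * (r - 1) / (r * (c - 1))


sixths = [
    [6, 6, 6, 0, 0, 0],   # item score 1
    [0, 0, 0, 6, 6, 6],   # item score 0
]
three_groups = [[5, 3, 0], [3, 3, 2], [2, 3, 4], [0, 1, 4]]
two_groups = [[row[0], row[2]] for row in three_groups]   # drop the middle third

print(round(d2(sixths), 2), round(upper_limit(c=6, r=2), 2))
print(round(d2(three_groups), 3), round(d2(two_groups), 2))
```

Dropping the middle group raises the index from about .53 to .78 for the same item, which illustrates why values obtained with different numbers of criterion groups should not be compared directly.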


SUMMARY 


$D_1$ and $D_2$ do not necessarily give identical values for the same data, nor do they yield the same information. There is no reason to expect identical results except that both reduce to the conventional U-L index in the case of dichotomous items and two criterion groups.

The $D_1$ index is easily computed and understood, but its value is dependent on the item scores employed. $D_2$ is perhaps applicable to a wider variety of testing situations, but is computationally complex. Both indices are related to the slope of the item characteristic curve for the item and criterion employed. $D_1$ is a direct estimate of the linear slope of this curve


as а measure of item-criterion association, 18 also 
Its exact 


A study to determine which of these (and other alternative) indices is most useful for various purposes is underway. Specifically, their usefulness for selecting maximally reliable and valid items from an item pool will be investigated for items which are 6-point ratings. Comparisons of D values for items scored with and without correction for guessing will be made, and computer simulation of essay item scores will be used to estimate the sampling distributions of the indices.


REFERENCES 


1. Cureton, Edward E., "Rank-biserial Correlation," Psychometrika, 21:287-290, September 1956.

2. Findley, Warren G., "Rationale for the Evaluation of Item Discrimination Statistics," Educational and Psychological Measurement, 16:175-180, Summer 1956.

3. Glass, Gene V., "A Ranking Variable Analogue of Biserial Correlation: Implications for Short-cut Item Analysis," Journal of Educational Measurement, 2:91-95, June 1965.

4. Johnson, A. Pemberton, "Notes on a Suggested Index of Item Validity: The U-L Index," Journal of Educational Psychology, 42:499-504, December 1951.

5. Kelley, Truman L., "The Selection of Upper and Lower Groups for the Validation of Test Items," Journal of Educational Psychology, 30:17-24, January 1939.

6. Kendall, Maurice G., Rank Correlation Methods, Griffin Co., London, England, 1948, pp.


It presents the essentials—the meth- 
, and the key Concepts of the field, In his preface Cronbach states, ‘This 


Бе can grow," He Stresses that new 


acknowledgements that reader 5 ha va 
S third edition this would be а gross error, He has used the occasion d 
ге of his three editions and + 


more convinced than, 

© ofaptitude-treatment interactions. 

more frequent and more on-target, In fact, he 

Width-fidelity Dilemma, ? «Но 

“Obsolescence of Norms, ” and “The Signal-noise Ratio, » Among the new Sections that have been added are: ^ 

‘Evolution of the Testing Ei Xpectation of Failure, ” and “Testing in Developing Nations.” He 
placing a 1960 Section entitleq “D 

with “Development of an Aptitude 


, 
evelopment of i itude Test" | 
Test for Computer Programmers,» ^ 8 Stenographic Aptitu 


о іп a more tentative voice 
Uthoritarian “How to » a 
Sra peau ft ag ret ese чш eea ant 
. ersonality Measurement»; 
of sound, Although the materia] is largely the sa; 
more cautious re-titlin, 
Self-description as a Report of Т 
description: Report of Typical 


| through a cei | 

r no longer has the meter-stick preci 

complexity of the subject is conveyed through Cronbach'$, f 
t ersonality, > reas he c 

Ypica] Behavior, "hen 


5 e d for t * The 
OW more Questioning]y hag titled mony preseni 


€d the section, «The Self- | 
The amount of attention given to the I 


has been sh 
upon research and theory Concerning ability test ч мы reduced 
and tests for infants and preschool, , he has devoted givi 
vergent and divergent thinking, the Social i i esis, and 4 a 
formance, Surprisingly, > 
work in the field, but rath 


minated, not 


because of lack of current 
beyond the bounds of treatment in а Бепега1 text. 
Cronbach’s usual high standard of Scholarship is evident thre i 
static condition of the field, a comparison of the Second ang third оао. жые = eni attations 60 odi lon 
has been obsolesced, Reader, take note, "t evident шіні 0 
Professor Norman R. Stewart, Reviewer
Michigan State University


THE JOURNAL OF EXPERIMENTAL EDUCATION
(Volume 39, Number 3, Spring 1971)

A MULTIPLE REGRESSION APPROACH TO MULTIPLE COMPARISONS FOR COMPARING SEVERAL TREATMENTS WITH A CONTROL

JOHN D. WILLIAMS
The University of North Dakota

ABSTRACT

A multiple regression approach is used for the multiple comparison situation in which several treatment groups are compared to a control group. Using a regression approach, it is shown that the multiple comparison procedure described by Dunnett can be found as a by-product of the general regression program; the test described by Dunnett yields identically the same results as the test of significance for the partial regression weights.

RECENT efforts (notably Bottenberg and Ward (1) and Jennings (5)) have been made to present multiple linear regression as a problem-solving technique. Ward (8) has compared four different approaches to problem solving: analysis of variance, multiple regression, analysis of covariance, and a technique called VARICO, a "sort of reverse covariance analysis." Ward showed that, while the methods differed conceptually, the four approaches have many basic ideas in common. The difficulty of recognition of this situation, i.e., the relationship that exists between the usual analysis of variance approach and multiple regression as a general data-analytic system, has been discussed by Cohen (2). Jennings (5) discussed at length a 2x3 fixed effects analysis of variance from a regression viewpoint. Both Jennings and Ward have extensively used a binary coding to effect a solution.

One criticism that has been made of the multiple regression approach, as compared to the analysis of variance approach, is that while the analysis of variance can be duplicated by a regression analysis, no real advantage is gained. Without discussing this criticism in detail (Cohen (2) has already done so), this article shows an additional conceptual usefulness of a multiple regression approach by focusing on a particular application of a regression approach to the problem of multiple comparisons. The regression approach is conceptually simple and yields a multiple comparison procedure for comparing several treatments with a control, often referred to as Dunnett's (3,4) test. It should be made clear from the onset that the present effort does not purport to extend Dunnett's test, ... linear regression, to ... with considerably less effort. ... meaning to the testing of significance for the partial regression weights.

To show the relationship between a regression approach and the usual analysis of variance approach, an example using sample data is first subjected to the computations of analysis of variance, and then Dunnett's test is run. The problem is then reformulated from a regression viewpoint, and finally comparisons of the two methods can be made.

AN EXAMPLE

The following data are presented for analysis:

Control
Group      Group I    Group II    Group III

  9           E          M           15
  8           8          12          15
  3           6          n           17
  4           6          14          n

TABLE 1

SUMMARY TABLE FOR COMPARING SEVERAL TREATMENTS WITH A CONTROL

Source of                   Sum of       Mean
Variation          df       Squares      Squares      F

Among groups        3       185.00       61.67        13.333
Within groups      16        74.00        4.625
Total              19       259.00


The first group has been labeled $X_0$ and serves as the control group to which all other groups are compared.

Dunnett's test is a test for comparing several treatments with a control. The means ... $\bar{X}_p$, where the $\bar{X}$'s are ...

$$t_i = \frac{(\bar{X}_i - \bar{X}_0) - (m_i - m_0)}{s\sqrt{\dfrac{1}{N_i} + \dfrac{1}{N_0}}} \qquad (1)$$

Using $MS_w$ as the estimate of $s^2$, and assuming all groups are of equal size with the hypothesis $m_i - m_0 = 0$, then (1) can be reduced to

$$t_i = \frac{\bar{X}_i - \bar{X}_0}{\sqrt{\dfrac{2(MS_w)}{n}}} \qquad (2)$$

Dunnett's original article develops ... for each comparison; a critical difference can also be used so that the comparisons can be made quickly:

$$c.d. = t\sqrt{\frac{2(MS_w)}{n}}$$

where t is taken from tables prepared by Dunnett (3, 4). While the critical difference method is ...

The three comparisons to the control group can be effected by equation (2).

$$t_1 = \frac{7.0 - 6.0}{\sqrt{\dfrac{2(4.625)}{5}}} = .735$$

Similarly, $t_2 = 4.411$ and $t_3 = 5.147$. Using Dunnett's tables, $t_2$ and $t_3$ are both significant at the .01 level, while $t_1$ is not significant.

A REGRESSION APPROACH

On the other hand, the problem can be viewed from a regression viewpoint. It is helpful to define four binary predictors:

$X_0$ = 1 if the score is from a member of the control group; and 0 otherwise
$X_1$ = 1 if the score is from a member of group 1; and 0 otherwise
$X_2$ = 1 if the score is from a member of group 2; and 0 otherwise
$X_3$ = 1 if the score is from a member of group 3; and 0 otherwise

A linear model can be written for this situation:

$$Y = b_0 + b_1X_1 + b_2X_2 + b_3X_3 + e \qquad (4)$$

where
$b_0$ = the Y-intercept
$b_1$ = the regression coefficient for group 1
$b_2$ = the regression coefficient for group 2
$b_3$ = the regression coefficient for group 3

TABLE 3

OUTPUT OF MULTIPLE REGRESSION PROGRAM

Variable                Standard     Correlation    Regression     Std. Error       Computed
No.         Mean        Deviation    X vs Y         Coefficient    of Reg. Coef.    t Value     Beta

3           0.25000     0.44426      -0.40109       1.00001        1.36014          0.73522     0.12033
4           0.25000     0.44426       0.40109       6.00001        1.36014          4.41130     0.72197
5           0.25000     0.44426       0.56153       7.00001        1.36014          5.14652     0.84230
Dependent
1           9.50000     3.69210

Intercept                   5.99999
Multiple Correlation        0.84515
Std. Error of Estimate      2.15058

Analysis of Variance for the Regression

Source of Variation            Degrees of     Sum of        Mean
                               Freedom        Squares       Squares      F Value

Attributable to Regression        3           185.00027     61.66675     13.33340
Deviation from Regression        16            73.99973      4.62498
Total                            19           259.00000

It can be noticed that the control group has seemingly been left out. However, if this equation is solved for an expected value for a member of the control group,

$$E(Y) = b_0 + b_1(0) + b_2(0) + b_3(0)$$

$$E(Y) = b_0.$$

The expectancy for a member of the control group will by definition be $\bar{X}_0$. Thus, a least squares solution for $b_0$ is $\bar{X}_0$, the mean of the control group.

For a member in group 1, the expected value would be

$$E(Y) = b_0 + b_1(1) + b_2(0) + b_3(0)$$

$$E(Y) = b_0 + b_1 \qquad (5)$$

$$E(Y) = \bar{X}_0 + b_1.$$

A least squares solution for the expectancy of a given member of group 1 is the mean of group 1. Thus $\bar{X}_1 = \bar{X}_0 + b_1$ from equation (5), or

$$\bar{X}_1 - \bar{X}_0 = b_1. \qquad (6)$$

Likewise $b_2 = \bar{X}_2 - \bar{X}_0$ and $b_3 = \bar{X}_3 - \bar{X}_0$.

Equation (4) can be rewritten

$$Y = \bar{X}_0 + (\bar{X}_1 - \bar{X}_0)X_1 + (\bar{X}_2 - \bar{X}_0)X_2 + (\bar{X}_3 - \bar{X}_0)X_3 + e. \qquad (7)$$

Equation (7) lists precisely the comparisons of interest for comparing several treatments with a control. Since equation (4) (and, therefore, equation (7)) is the full model for the expression of a one-way analysis of variance, this approach also yields results identical to the analysis of variance situation. Thus, using equation (4), it can be seen that these two useful results can be obtained simultaneously: the usual analysis of variance as one part of the output, and Dunnett's test as the other part.

The information necessary for a regression solution, with equation (4) as the linear model, can be conveniently placed in tabular form (see Table 2). For the data in Table 2, the general purpose multiple regression program was used. Table 3 contains the printout from that analysis. The variable number in Table 3 refers to the order in Table 2; the criterion variable is variable number 1; variable 2 refers to the control group, variable 3 to group 1, variable 4 to group 2, and variable 5 to group 3. Because variable 2 refers to the control group, no information appears in the printout using this variable number. The table of residuals has not been included herein.

Table 3 contains the previously mentioned items. It can be recalled that $\bar{X}_0 = 6.0$, $\bar{X}_1 = 7.0$, $\bar{X}_2 = 12.0$, $\bar{X}_3 = 13.0$. The intercept is 6.0 (within rounding error) and is $\bar{X}_0$. Also $b_1 = 1 = \bar{X}_1 - \bar{X}_0$ and is in keeping with equation (7). Similar statements could be made concerning $b_2$ and $b_3$. Of more interest for this particular presentation is that the computed t values are identical (to three decimal places) to the t values found by using equation (2), which is Dunnett's test for equal-sized groups.
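
The agreement between equation (2) and the computed t values in Table 3 can be checked from the summary statistics alone. The sketch below is illustrative: the function name `dunnett_t` is an assumption, and the group size n = 5 is inferred from the 20 observations and four equal-sized groups implied by the degrees of freedom in Table 1.

```python
import math


def dunnett_t(mean_treatment, mean_control, ms_within, n_per_group):
    """Equation (2): t for one treatment-versus-control comparison, equal group sizes."""
    return (mean_treatment - mean_control) / math.sqrt(2 * ms_within / n_per_group)


means = {"control": 6.0, "group 1": 7.0, "group 2": 12.0, "group 3": 13.0}
ms_within, n = 4.625, 5

t_values = [dunnett_t(means[g], means["control"], ms_within, n)
            for g in ("group 1", "group 2", "group 3")]
table3_t = [0.73522, 4.41130, 5.14652]   # computed t values printed in Table 3

for t_eq2, t_printout in zip(t_values, table3_t):
    print(f"{t_eq2:.3f}  {t_printout:.3f}")
```

The denominator, the square root of 2(4.625)/5, is about 1.360, which is also the standard error of each regression coefficient shown in Table 3.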


Dunnett's (3,4) tables are necessary for preserving the probability level. If the experimental groups are not of equal size, then the test described by Dunnett, and therefore the present formulation, results in an approximate test.


DISCUSSION 

A major reason for using multiple comparison procedures has been to make individual comparisons of means and to simultaneously preserve the probability level. One of several multiple comparison ...


НН Xe + by X + by Xp + by Hy + (8) 
The model given in equation (4) and equation (8) have the same notation, with these exceptions:

U = a unit vector (i.e., a predictor vector containing 1's),
$b_c$ = the regression coefficient for the control group,
$X_c$ = the control group,
$e_1$ = the error involved in prediction for equation (8).

To test the hypothesis that the control group and group 1 are equal, the following restriction can be made:

$b_c = b_1$ (This is the same hypothesis as $\bar{X}_0 = \bar{X}_1$.)

Then equation (8) can be rewritten:


ама ылы Ты” 

where

$a_1$ = the regression coefficient for the combined groups of $X_0$ and $X_1$,
$e_2$ = the error involved in prediction for equation (9).

Comparing equation (9) to equation (8) in the methodologies of Kelley and others (6), identically the same result ($F = .5405 = (.735)^2$) is achieved as is in Table 1. Similar procedures would yield the other comparisons.

SUMMARY 


The present paper has presented a specific application of multiple regression as a problem-solving technique to the problem of multiple comparisons. Results of an analysis of variance and the subsequent multiple comparisons of several treatments with a control are given. When using the regression approach presented herein, those same comparisons can be read directly from a regression printout, which illustrates that Dunnett's test can be conceptualized as a test of significance for a partial regression weight. This is accomplished by setting the regression coefficient for the control group equal to zero in a linear model. This is done by not including the vector for the control group in the prediction equation (linear model). Effectively, if the researcher wishes to make comparisons of several treatments with a control, he needs only to binary code the group membership and use the resulting binary coded vectors (not including the control group binary coded vector) as predictors (as demonstrated herein). The computed t test for each partial regression weight is identical to the test Dunnett suggested for comparing several treatment groups to a control group. Thus, no additional computations need be made, either by a calculator or by additional computer runs for this situation.
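
The procedure summarized above can be sketched in Python without any statistical library. The data below are hypothetical (they are not the article's data), and the helper names `invert` and `fit` are assumptions; the sketch solves the least squares normal equations directly so that the coefficients and their t values can be compared with group mean differences and with equation (2).

```python
import math


def invert(matrix):
    """Invert a small square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(matrix)
    aug = [list(map(float, row)) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        pivot_value = aug[col][col]
        aug[col] = [v / pivot_value for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def fit(groups):
    """groups[0] is the control. Returns (coefficients, t values of the treatment weights)."""
    k = len(groups)
    rows, y = [], []
    for g, scores in enumerate(groups):
        for score in scores:
            # Unit vector plus one binary vector per treatment; no vector for the control.
            rows.append([1.0] + [1.0 if g == j else 0.0 for j in range(1, k)])
            y.append(float(score))
    xtx = [[sum(r[a] * r[b] for r in rows) for b in range(k)] for a in range(k)]
    xty = [sum(r[a] * yi for r, yi in zip(rows, y)) for a in range(k)]
    inv = invert(xtx)
    b = [sum(inv[a][c] * xty[c] for c in range(k)) for a in range(k)]
    residual_ss = sum((yi - sum(bj * xj for bj, xj in zip(b, r))) ** 2
                      for r, yi in zip(rows, y))
    ms_error = residual_ss / (len(y) - k)
    t = [b[a] / math.sqrt(ms_error * inv[a][a]) for a in range(1, k)]
    return b, t


# Hypothetical scores: a control group and two treatment groups of five Ss each.
groups = [[3, 5, 4, 6, 7], [6, 8, 7, 5, 9], [10, 9, 12, 11, 13]]
coefficients, t_values = fit(groups)
print([round(c, 3) for c in coefficients], [round(t, 3) for t in t_values])
```

With these hypothetical data the intercept equals the control mean (5), each weight equals a treatment mean minus the control mean (2 and 6), and each t equals the mean difference divided by the square root of $2(MS_w)/n$, here 1.0.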
FACULTY ATTITUDES TOWARD UNIVERSITY ROLE AND GOVERNANCE: A FACTOR ANALYTIC APPROACH

L. ERWIN ATWOOD and KENNETH STARCK
Southern Illinois University, Carbondale

ABSTRACT

Despite the changing role of the university faculty member (7), little inquiry has been made into faculty ... Utilizing a 29-item instrument devised by the Educational Testing Service (ETS), this study sought (1) to determine what faculty members regard as some of the issues and (2) to identify any existing patterns of opinion. The sample consisted of 132 interviews with faculty members of a large Midwestern university. R-analysis isolated four dimensions of concern. Two centered on aspects of freedom and control within the university: the teaching versus research dichotomy, and university involvement in societal concerns. Q-analysis yielded three basic opinion types. These included concerns for academic freedom and control, social activism of the university, and research versus teaching. Multiple linear regression indicated that predominant demographic characteristics associated with opinion types were political orientation, type of job, and length of time teaching.



EXAMINATION of attitudes of university faculty members toward various issues has been the legitimate concern of educational investigators for some time. Ranging from rather crude measuring devices yielding frequency data to elaborate instruments designed to utilize powerful statistical techniques, the studies generally deal with a single concern, such as classroom behavior (3) and liberal versus professional education (2). A few studies, including the Faculty Attitude Survey (9), have sought to chart the dimensions of "faculty morale" and "satisfactions and dissatisfactions" (10). Even fewer attempts have been made to delineate faculty perceptions of issues that have assumed paramount importance in higher education of the 1960's, namely, the definition of the role and regulation of the university.

Indications are that the rapid growth of the university has produced a change in the role of the faculty member. Generally, the change has diminished his autonomy and made it impractical for him to participate directly in policy making (7). Graybeal (4) reported results of a national survey in which college and university faculty members were asked about institutional practices involving promotions, academic freedom, and faculty authority; however, apparently no attempt was made to relate particular issues to one another or to identify attitude patterns among the respondents.

PURPOSE

The purpose of this study was to determine how faculty members perceive certain selected issues centering on the university's role in society and the regulation and control of the university as represented by a set of twenty-nine statements. Further, an attempt was made to explore the relationships between opinion patterns and such demographic characteristics as age, political orientation, and type of job.


METHOD 

Instrument

In early 1969 the ETS released results of a national study which dealt with college and university boards of trustees (8). Besides providing for data on the role of the trustee and personal background information, the questionnaire developed for the study included twenty-nine Likert-type statements designed to examine the trustees' perceptions of the role of higher educational institutions and their governance. Respondents were asked to check one of the five alternatives that best represented their feelings about each statement. The alternatives were: strongly agree, agree, don't know, disagree, and strongly disagree.

The first eighteen statements were to be regarded by respondents as applicable to their institutions; the remaining eleven statements were to be regarded in terms of higher education as a whole. These twenty-nine statements, with minor adaptations, together with twelve demographic items, comprised the instrument for the current study. The statements appear in Table 1.


Sample 


A simple probability sample of 220 faculty members
was drawn from the faculty directory of a Midwestern
university which at the time of the study (Spring 1969)
had an enrollment of about 20,500. Only
the academic ranks of instructor, assistant and
associate professor, and professor were included. The
sample included a number of persons whose primary
duties were not necessarily instructional inasmuch
as nearly all administrative and many service
personnel carry academic rank within one of nine
academic units at the institution. Further, the faculty
directory did not specify the individual's primary
duties. The population numbered eight hundred. Of the
220 personal interviews assigned, 145 were completed;
of these, 132 questionnaires, or 60 percent of the
original sample, were usable.


By academic unit there was little discrepancy
between the number of respondents and non-respondents,
e.g., 44.6 percent of the respondents were in the
College of Liberal Arts and Sciences, which has 46.6
percent of the entire university faculty. Of the
respondents, 60.7 percent held doctoral degrees
compared with 53.3 percent for the population. By
faculty rank, the respondents underrepresented professors
and associate professors, 20.8 percent and 13.1
percent respectively, as compared with 33.0 and 27.5
percent for the population. Overrepresented among
the respondents were assistant professors, 37.7
percent versus 31.8 percent for the population, and
instructors, 28.5 percent versus 7.6 percent for the
population.


As for the non-respondents, the percentage of women
not responding, 22.1 percent, was higher than the
percentage of women in the total population (15.4
percent). The percentage of associate professors and
professors not responding, 20.0 and 28.4 percent
respectively, was somewhat higher than for those from the
total population responding, 13.1 percent for
associate professors and 20.8 percent for professors.


Data Treatment 


[...] between all pairs
of respondents (Q-factoring). An opinion pattern was
defined as [...] twenty-nine statements
that represented the pattern a respondent felt was
[...] representative of how he felt about the
statements.


[...] for the R-factoring, [...] eliminated at random;
for Q-factoring, an additional twenty-one respondents
were dropped [...] with the restriction that none of
the twenty-one [...] research as his primary
responsibility. [...] of the 132 respondents fell into
the research category.


Multiple linear regression (6) afforded a means
of evaluating relationships between opinion types and
demographic characteristics. Factor loadings for
each opinion type served as criteria, with demographic
variables serving as predictors. In each case models
were tested for curvilinearity and interaction effects
before the linear models were analyzed; none were
found in the models tested.


RESULTS 


Issues 


For the R-factor analysis a principal axis solution
was used with rotation to orthogonal (Varimax) simple
structure. The minimum eigenvalue for factoring
was 1.0. Three distinct dimensions emerged with
a fourth appearing to be related to one of the three.
When a 3-factor solution was requested, Factors 1
and 4 collapsed into a single dimension without
changing the interpretation of Factors 2 and 3.
Correlations among all pairs of statements and the simple
structure factor matrix are given in Tables 2 and 3.
While Factor 1 represented a more generalized control
dimension, Factor 4 was concerned primarily
with faculty freedom and participation in the
determination of university policy. For example, these
two statements best represented Factor 4:


There should be faculty representation
on (name of university) governing board.


(Name of university) faculty members
should have the right to express their
opinions about any issue they wish in
various channels of university communication,
including the classroom, student
newspaper, etc., without fear of
reprisal.


These two statements best represented Factor 1: 


The (name of university) administration
should exercise control over the
content of the student newspaper.


Attendance at (name of university) is
a privilege, not a right.


Factor 2 appeared to revolve around the seemingly
endless conflict between those with teaching, as
opposed to research, orientations. Although showing
a slight concern for curriculum, Factor 2 seemed
best represented by these two statements:


The value of the PhD (EdD) is overemphasized
in recruiting a faculty at
(name of university).


Teaching effectiveness, not publications,
should be the primary criterion
for promotion of faculty at (name of
university).


Factor 3 was one of social concern. This dimension
saw the grouping together of statements related to
problems of current social conditions and, to a lesser
extent, problems of curriculum revision and
reorganization of the academic community. Factor 3
differed from Factor 2 on curriculum; the former
was concerned with changing the curriculum to meet
new demands, while the latter appeared more
interested in pitting "teaching" against "research." The
two statements best typifying Factor 3 were:


(Name of university) should be actively
engaged in solving contemporary social
problems.


Colleges should admit socially disadvantaged
students who do not meet normal
entrance requirements.


TABLE 1
Z-SCORES FOR ALL OPINION TYPES FOR EACH OF THE TWENTY-NINE STATEMENTS*

                                                                        Opinion Type
No. Statement                                                             I     II    III

 1. Attendance at (name of university) is a privilege, not a right.     0.0    ...    0.7
 2. In making admissions decisions, academic aptitude should be the
    most important criterion (i.e., given the greatest weight) at
    (name of university).                                               0.8    0.5   -0.8
 3. (Name of university) faculty members should have the right to
    express their opinions about any issue they wish in various
    channels of university communication, including the classroom,
    student newspaper, etc., without fear of reprisal.                  ...    ...    ...
 4. The (name of university) administration should exercise control
    over the content of the student newspaper.                         -1.7    0.6   -0.9
 5. All campus speakers should be subject to some official screening
    process.                                                           -1.6    0.5    0.1
 6. There should be faculty representation on (name of university)
    governing board.                                                    1.7    0.2    0.2
 7. Students who actively disrupt the functioning of (name of
    university) by demonstrating, sitting in, or otherwise refusing
    to obey the rules, should be expelled or suspended.                -0.3    1.4    1.2
 8. The grading system now in use at (name of university) needs to
    be modified.                                                        0.4   -0.4    0.1
 9. An active research interest is a prerequisite for good
    undergraduate teaching. A man who does no research on a subject
    soon becomes less qualified to teach it.                            0.4   -0.6   -2.0
10. The value of the PhD or EdD is overemphasized in recruiting
    faculty at (name of university).                                   -0.8   -2.3    1.2
11. (Name of university) should be actively engaged in solving
    contemporary social problems.                                       1.0    0.8    0.4
12. Teaching effectiveness, not publications, should be the primary
    criterion for promotion of faculty at (name of university).         0.4    0.1    ...
13. (Name of university) should serve as a cultural center for the
    population in the surrounding area.                                 ...    ...    ...
14. (Name of university) curriculum should be deliberately designed
    to accommodate a wide diversity in student ability levels and
    educational-vocational aspirations.                                 ...    ...    ...
15. (Name of university) should be as concerned about the personal
    values of its students as it is with their intellectual
    development.                                                        ...    ...    ...
16. Students involved in civil disobedience off the (name of
    university) campus should be subject to discipline by the
    college as well as the local authorities.                           ...   -0.2   -1.3
17. There should be more professional educators on (name of
    university) board of trustees.                                      0.6   -0.9   -0.3
18. The more appropriate role of the (name of university) president
    is that of mediator rather than leader.                            -0.5   -2.3   -2.1
19. There should be opportunities for higher education available to
    anyone who seeks education beyond secondary school.                 ...    ...    ...
20. The requirement that a professor sign a loyalty oath is
    reasonable.                                                         ...    ...    ...
21. A definite institutional religious commitment does not
    necessarily preclude a genuine exposure of the student to
    alternative views nor prevent free inquiry on the part of the
    faculty.                                                            ...    ...    ...
22. Increased federal support of higher education will mean
    increased federal control.                                         -0.3   -0.5   -0.7
23. The typical undergraduate curriculum has suffered from the
    specialization of faculty members.                                  ...    ...    ...
24. Colleges should admit socially disadvantaged students who do
    not meet normal entrance requirements.                              ...    ...    ...
25. Traditionally Negro institutions serve a necessary function by
    offering the Negro student a curriculum which more nearly meets
    his needs and educational background.                              -1.0   -1.2   -1.3
26. A coeducational institution provides a better educational
    setting than a college for only men or women.                       ...    ...    ...
27. Collective bargaining by faculty members has no place in a
    college or university.                                             -0.8   -0.2   -0.1
28. Running a university is basically like running a business.         -1.4   -0.9   -0.9
29. Fraternities and/or sororities or similar social clubs provide
    an important and positive influence for undergraduates.            -0.2    0.2   -0.5

* Statements from the College Trustee Study questionnaire, Copyright 1968
by Educational Testing Service. All rights reserved. Adapted by permission.


These four factors, although accounting for only
30.18 percent of the total variance, appeared to
exhaust the dimensions the respondents perceived to
exist within the set of twenty-nine statements (see
Table 3). It should not be assumed that these four
dimensions exhaust all possible meaningful factors
that different sets of respondents might perceive or
that might result from a different sampling of items.
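
The communalities (h²) and the percentages of variance reported for the R-factor solution follow from simple arithmetic on the loadings and eigenvalues. The sketch below is illustrative only: the function names are hypothetical, it assumes each standardized statement contributes one unit of variance, and it uses only a few values as printed in Table 3 (the loadings for statements 1 and 6 and the first eigenvalue).

```python
# Loadings on Factors 1-4 for two statements, as printed in Table 3.
loadings = {
    1: [0.690, -0.005, 0.240, 0.064],
    6: [-0.034, -0.064, 0.102, 0.615],
}


def communality(row):
    """Sum of squared loadings across the retained factors (h squared)."""
    return sum(x * x for x in row)


def percent_variance(eigenvalue, n_variables):
    """Percent of total variance carried by one factor, assuming each
    standardized variable contributes one unit of variance."""
    return 100.0 * eigenvalue / n_variables


N_STATEMENTS = 29          # number of statements in the instrument
FIRST_EIGENVALUE = 4.5402  # first eigenvalue reported under Table 3

for number, row in loadings.items():
    print(number, round(communality(row), 3))

print(round(percent_variance(FIRST_EIGENVALUE, N_STATEMENTS), 2))
```

Small differences in the third decimal place between a recomputed communality and the printed h² are to be expected, since the printed loadings are themselves rounded.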


Opinion Types 


For the Q-analysis, the raw score matrix was
normalized before correlations were computed. The
factor analysis was a principal axis solution with
rotation to orthogonal (Varimax) simple structure. The
minimum eigenvalue criterion for factoring was 1.0.
For each Q-type, that is, opinion type, standard
scores (z-scores) were calculated for each of the
twenty-nine statements according to procedures
originally outlined by Stephenson (11). A z-score
difference of ±1.0 served as the criterion for a
meaningful difference between opinion types on
any statement.


Copyright 1968 by Educational Testing Service. 


Thus, z-scores for a given opinion type which were
greater than +1.0 indicated strong agreement with
the statement; z-scores less than -1.0 indicated
strong disagreement. And where z-score differences
across all types were less than ±1.0, the statements
were considered consensus items.
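
The decision rules just described can be sketched in a few lines. This is a minimal illustration, not the authors' procedure: the function names are hypothetical, and reading "differences across all types" as the gap between the largest and smallest z-score is an assumption. The three rows are taken from Table 1 as printed.

```python
def classify(z):
    """Label one opinion type's z-score using the +1.0 / -1.0 cutoffs."""
    if z > 1.0:
        return "strong agreement"
    if z < -1.0:
        return "strong disagreement"
    return "neither"


def is_consensus(z_by_type, criterion=1.0):
    """True when the largest gap between any two opinion types is
    smaller than the criterion (assumed reading of the rule)."""
    return max(z_by_type) - min(z_by_type) < criterion


# (Type I, Type II, Type III) z-scores for three statements in Table 1.
table1_rows = {
    4: (-1.7, 0.6, -0.9),    # administrative control of student newspaper
    25: (-1.0, -1.2, -1.3),  # traditionally Negro institutions
    28: (-1.4, -0.9, -0.9),  # running a university like a business
}

for number, zs in table1_rows.items():
    labels = [classify(z) for z in zs]
    kind = "consensus" if is_consensus(zs) else "differentiating"
    print(number, labels, kind)
```

Under these assumptions statement 4 separates the types, while statements 25 and 28 come out as consensus items.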


The Q-analysis isolated three basic patterns of
opinions (accounting for 41.64 percent of the total
variance) among the 109 respondents.² Again, other
opinions may exist among these faculty members.
Likewise, these three patterns do not necessarily
encompass all members of the university faculty. On
the other hand, the investigators are fairly confident
that these patterns were the predominant patterns
among most of the faculty at the time of the survey.

To begin with, it might be useful to summarize
the points on which all respondents generally agreed.
Twelve of the twenty-nine statements were consensus
statements, and all twelve were concerned with
socially oriented problems. This suggests there is
less disagreement among faculty members over the
third R-factor dimension, social concern, which is
probably the most socially acceptable of the
twenty-nine statements. The two consensus statements with
which there was strong agreement were:


(Name of university) should serve as
a cultural center for the population in
the surrounding region.


A coeducational institution provides a
better educational setting than a college
for only men or women.


Neither statement appears to be the type that
would arouse widespread argument. But this may
not be true of the following statement with
which there was strong disagreement:


Traditionally Negro institutions serve
a necessary function by offering the
Negro student a curriculum which more
nearly meets his needs and educational
background.


TABLE 3
R-FACTOR SIMPLE STRUCTURE MATRIX, N = 130, VARIMAX ROTATION*

Statement   Factor 1   Factor 2   Factor 3   Factor 4         h²

        1       .690      -.005       .240       .064       .539
        2       .146      -.416      -.090       .246        ...
        3      -.173       .065      -.020       .522       .308
        4        ...      -.085      -.016        ...        ...
        5       .654       .061      -.153      -.163       .481
        6      -.034      -.064       .102       .615       .393
        7       .648       .128       .065      -.102       .450
        8      -.196       .158       .168       .174       .120
        9       .028      -.388       .014       .214       .197
       10       .165       .650      -.105       .151       .483
       11      -.213      -.025       .510       .198       .315
       12       .293       .507      -.021       .099       .353
       13      -.030      -.077       .400       .056       .170
       14       .137       .022       .381       .026       .165
       15       .461       .298       .305      -.082       .401
       16       .539      -.124       .037      -.119       .321
       17      -.217       .015       .041       .476       .276
       18      -.311      -.019      -.092       .331       .261
       19      -.088       .311       .073       .224       .164
       20       .588       .066       .094      -.353       .483
       21       .125      -.156       .189      -.144       .096
       22       .251       .014      -.026      -.161       .095
       23       .006       .451       .012      -.003       .203
       24      -.401       .097       .488      -.051       .411
       25       .394      -.009      -.266       .033       .227
       26       .113       .082       .359       .002       .148
       27       .450      -.090       .025      -.363       .343
       28       .325       .205       .125      -.330       .212
       29       .140      -.061       .282      -.159       .128
% of var.      15.66       5.79       4.66       4.07      30.18

* Principal axis solution; minimum eigenvalue criterion = 1.000. Four chosen eigenvalues = 4.5402, 1.6788, 1.3528, 1.180


All respondents also disagreed with the statement
that "running a university is basically like running a
business." Respondents disagreed mildly with statements
that collective bargaining should not be used by
a university faculty, and that more federal money for
higher education will bring more federal control.


Four consensus statements with which there was 
agreement revolved around making higher education 
available to anyone who wants it, giving special con- 
siderations to disadvantaged students, designing cur- 
ricula to serve highly divergent needs and interests, 
and encouraging active participation by the university 
in solving social problems. Opinions about needed 
changes in the grading system were mixed, although 
the differences were not substantial. 


Correlations between z-score patterns for the 
three opinion types appear in Table 4. Although two 
of the coefficients were statistically significant, the 
relationships were low, the largest accounting for 
only 20.25 percent of the variance. The z-score pat- 
terns for all opinion types appear in Table 1. 
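
The "20.25 percent" figure is the square of the largest correlation in Table 4, expressed as a percentage. A minimal sketch of that arithmetic follows; the function name is hypothetical, and the three coefficients are those printed in Table 4.

```python
# Correlations between z-score patterns of the opinion types (Table 4).
correlations = {
    ("I", "II"): 0.28,
    ("I", "III"): 0.45,
    ("II", "III"): 0.43,
}


def shared_variance_percent(r):
    """Percent of variance two patterns share: 100 times r squared."""
    return 100.0 * r * r


for pair, r in correlations.items():
    print(pair, round(shared_variance_percent(r), 2))
```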


The most prominent characteristic of Type I
(N = 52) seemed to be a concern for academic and
personal freedom of both faculty and students and a concern
for faculty participation in the governing of the
university. To a greater extent than all other types,
Type I felt a need for more faculty representation on
the university's governing board.


TABLE 4
CORRELATIONS BETWEEN z-SCORE OPINION TYPES

Opinion Type        II       III

I                  .28      .45*
II                          .43*

* p = .05


Type I also felt faculty members should not be
subject to administrative reprisal for expressing their
opinions freely, and that the faculty should have
access to university communication channels (the
student newspaper) to make its thoughts known. Type
III agreed on both counts, albeit to a substantially
lesser extent, while Type II disagreed on the [...]
point. Type I also would extend freedom to
others as evidenced by its strong disagreement with
this statement:


All campus speakers should be subject 
to some official screening process. 


Types II and III felt all campus speakers should be
screened before being allowed to speak.


Type I disagreed only slightly with the statement
that students who disrupt the campus should be
expelled or suspended; other types felt such students
should be expelled. Type I also strongly disagreed
with the idea that the administration should exercise
control over the content of the student newspaper;
Type III disagreed only mildly, and Type II favored
administrative control.


To a lesser extent than did Type II, Type I
disagreed with the statement that the PhD and EdD are
overemphasized in recruiting new faculty; Type III
felt such an overemphasis exists. Type I also, less
than the other types, disagreed with the statement
that the university president should serve more as
a mediator than a leader. Finally, while the other
types disagreed, Type I agreed with this statement:


An active research interest is a prerequisite
for good undergraduate teaching.
A man who does no research on
a subject soon becomes less qualified
to teach it.


What are the demographic characteristics of
Type I? Analysis of data in the multiple linear
regression accounted for 32.48 percent of the variance
of the Type I factor loadings (see Table 5). Three
variables accounted for significant proportions of
variance, and the Type I individual is most likely to
be a "liberal" and a Democrat who has been on the
campus less than 10 years.


Type II (N = 27) displayed what might be termed
a power and control orientation, perhaps an academic
version of the current political "law and order"
slogan. Typifying this characteristic was the following
statement, with which Type II agreed and the other
types disagreed:


The (name of university) administration 
should exercise control over the content 
of the student newspaper. 


Perhaps more significant was Type II's noncommittal
response to the statement that it is reasonable to
require faculty to sign a loyalty oath. Both Types I
and III felt strongly that the requirement is
unreasonable. Type II did not believe faculty should have
the right to express their opinion freely on any issue
through university communication channels; while
Types I and III strongly objected to college officials
disciplining students who take part in civil
disobedience off campus, Type II felt only slightly that the
administration should not take punitive action. Type
II strongly disagreed with the statement that the
typical undergraduate curriculum has "suffered from
the specialization of the faculty" and disagreed
strongly with the statement that the importance of
the PhD and EdD is overemphasized in recruiting new
faculty. Type I disagreed on both of these points to
a substantially lesser degree than did Type II, and
Type III agreed with both points. Type II also strongly
felt that attending the university is a privilege, not
a right, and that the university should be as concerned
with the personal values of students as with their
intellectual development.


Two demographic variables (political party
preference and political orientation) accounted for
20.81 percent of the factor loading variance. Type
II is most likely to be a Republican, and he is quite
unlikely to consider himself a "liberal."


TABLE 5
PREDICTOR VARIABLES ACCOUNTING FOR SIGNIFICANT PROPORTIONS OF VARIANCE
FOR EACH OPINION TYPE

Variable                          Variance    p-value         r*

Type I
  Liberal                             9.48     = .01       .4835
  Democrat                           18.67     = .01       .4516
  Less than 10 years on campus        4.33     = .05       .2105
  TOTAL                              32.48

Type II
  Liberal                             4.55     = .025     -.3566
  Republican                         16.26     = .01       .4033
  TOTAL                              20.81

Type III
  Liberal                             3.25     = .05      -.2256
  Research                            6.65     = .05      -.2506
  More than 10 years on campus        5.78     = .05       .2404
  TOTAL                              15.68

* Zero order correlation coefficient between predictor and criterion variable.


Differentiating Type III (N = 30) from the other
types was its "teaching versus research" orientation.
Type III most strongly agreed with this statement:


Teaching effectiveness, not publications,
should be the primary criterion for
promotion of faculty at (name of university).


Types I and II also agreed with the statement but to
a substantially lesser degree. Type III strongly
disagreed with the statement that "an active research
interest is necessary for good undergraduate teaching"
and that a faculty member who does no research
"becomes less qualified" to teach a subject. Type
II also disagreed but to a substantially lesser extent,
and Type I agreed. Type III felt strongly that the
value of the PhD and EdD is overemphasized in
recruiting new faculty. Type III did not feel, as did
the other types, that in making admissions decisions,
academic aptitude should be the most important
criterion.


Three demographic characteristics accounted for
15.68 percent of the variance of the Type III factor
loadings. Type III has been on campus for more
than 10 years and almost certainly is not involved in
research. He does not consider himself a "liberal."


DISCUSSION 


Any attempt to identify the concerns and opinion
patterns of a university faculty involves certain
encumbrances difficult to dislodge. Obviously, results
pertain only to the particular time of the investigation
and, when involving a single institution as in
this study, to a particular university. In addition,
this study dealt with a limited set of statements as
well as a limited set of respondents. Thus, while
the three opinion patterns may be representative of
the feelings of a large proportion of faculty on
campus, they cannot be considered as reflecting either
the "average" faculty member's opinion or all
the possible opinions held by the faculty on these
issues. Other sets of statements with these and other
faculty members may well produce somewhat different
responses.


Nonetheless, it may be presumed that these three
opinion types do reflect some of the diversity of
opinion that exists within a university faculty. Results
certainly lend empirical support to speculation about
the division of thinking among today's faculty. Rather
sharply delineated, for example, was the split
between teaching and research orientations. Similarly
apparent was the division between those calling for
more administrative control and those desiring less
such control. Results also pointed up strong faculty
concern for the university functioning as a social
catalyst.


Efforts to describe demographic characteristics
of opinion Types II and III were not particularly
successful, as evinced by the relatively small
proportions of variance accounted for by regression on
demographic variables. Despite this, Q-factor analysis
and multiple linear regression would appear to be
useful methods in achieving better descriptions of
opinion patterns relevant to issues of faculty concern
and in identifying individuals reflecting differing
opinion patterns. The need is for replication and more
extensive study. Such efforts could help in
understanding the position of the faculty member in today's
rapidly changing structure of higher education.


FOOTNOTES

1. The authors gratefully acknowledge cooperation
of the Educational Testing Service and, particularly,
ETS's permission to adapt questions
from the College Trustee Study.

2. Factor and z-score matrices for the Q-analysis
are available from the authors upon request.
Send requests to Dr. Kenneth Starck, Department
of Journalism, Southern Illinois University,
Carbondale, Illinois 62901.


REFERENCES

[...]

[...] Journal of Experimental Education, 33:379-382,
Summer 1965.

4. Graybeal, William S., "What the College Faculty
Thinks," NEA Journal, 55:48-49, April 1966.

5. Harman, Harry H., Modern Factor Analysis,
Second Edition Revised, University of Chicago
Press, Chicago, Illinois, 1967.

6. Kelly, Francis J. and others, Research Design
in the Behavioral Sciences: Multiple Regression
Approach, Southern Illinois University,
Carbondale, Illinois, 1969.

7. Lorimer, Margaret F.; Dressel, Paul L.,
"Faculty Characteristics - College and University,"
in Ebel, Robert L. (ed.), Encyclopedia
of Educational Research, Fourth Edition, The
Macmillan Company, New York, 1969.

8. "Most College Trustees Found White, Protestant,
Republican," The Chronicle of Higher
Education, January 13, 1969.

9. Richardson, Richard C., Jr.; Blocker, Clyde E.,
"An Item Factorization of the Faculty Attitude
Survey," The Journal of Experimental Education,
34:89-93, Summer 1966.


DEVELOPMENT OF


A SEMANTIC DIFFERENTIAL TO ASSESS
THE ATTITUDE OF SECONDARY
SCHOOL AND COLLEGE STUDENTS


RUSSELL N. CASSEL
University of Wisconsin-Milwaukee


ABSTRACT


The inquiry sought to develop a semantic differential (SD) for use in
assessing attitude and attitude change among secondary school and college
students. It included thirty-five bipolar adjectives, each using a 7-point
ordinal scale, for student rating purposes. Three concepts were used in the
study: teacher, learning, and student. A Likert type scoring was accomplished
with part scores for each of the separate concepts, and with the total score
being the sum of the three part scores. A comparison between pre- and
post-college course attitude was made involving 237 students, which showed
significant change only for the concept "student" (Me as a student).
Internal reliability indexes were obtained using the Kuder-Richardson (K-R)
Formula 20 for part scores ranging from r = .421 to .610; and for the total
score ranging from r = .928 to .960. Intercorrelations of part scores for
pretest ranged from r = .530 to .584; and for posttest r = .620 to .707.
There is evidence of greater homogeneity for post-course concepts than for
pre-course concepts used in the evaluation, i.e., teacher, learning, and
student.


THE OBJECTIVE of the inquiry was to develop a
psychological instrument for use in assessing
the attitude of secondary school and college students.
It sought to establish semantic scales for use
in a semantic differential on the basis of rigid
adherence [...] and standardization
procedures (5), to use adjectives for the development
of semantic scales that proved to be critical in
previous studies, to use Likert type scoring of the
semantic differential with part scores for
[...] concepts, and to validate against meaningful
criterion variables.


DEVELOPMENT OF SEMANTIC DIFFERENTIAL


Inasmuch as the reliability of a psychological
instrument is in large part a function of the number of
items contained as a "sample of behavior" of criterion
being assessed, it was deemed that a minimum
[...] or more semantic scales, used as individual
[...], would be essential. [...] semantic scale was
comprised of a rating
scale anchored by bipolar adjectives, and as
traditionally used for the semantic differential (6).


DEVELOPMENT OF SEMANTIC SCALES


The Adjective Check List, by Gough and Heilbrun
(2), has been used extensively in connection with the
identification and evaluation of adjective
criticalness in relation to human behavior.
Accordingly, twenty of the thirty-five semantic scales
in the final standardized Semantic Differential for
Secondary Students (see Appendix) were developed
from adjectives suggested as being critical
by studies using The Adjective Check List. [...]
Connecticut College for Women. Five of these adjectives
were selected by freshmen women with superior
grades at the end of the first semester, all of whom
were on the Dean's list, and with opposite adjectives
as follows:

practical - imaginative
thorough - partial
logical - illogical
sympathetic - unsympathetic
appreciative - unappreciative


The other five adjectives were selected by freshmen 
women with inferior grades at the end of the first 
semester, all of whom were on probation, and with 
opposite adjectives as follows: 


affectionate - hateful 
forgiving - unforgiving 
frank - deceitful 

loyal - disloyal 
tolerant - intolerant 


Ten more of the adjectives were reported in The
Adjective Check List Manual (2) for a study of
295 males, with six coming from those having high
scores on the "Mathematician Scale" of Strong
Vocational Interest Blank, and four from those having
low scores:


High Scores
civilized - uncivilized
curious - indifferent
insightful - blind
original - imitational
rational - irrational
sensitive - insensitive


Low Scores
lazy - ambitious
narrow interest - broad interest
shallow - deep
simple - complex


The remaining 15 semantic scales were taken 
from studies that clearly indicated the factorial 
identity of each (4,5,6): 


Evaluative Factor 
wise - foolish
successful - unsuccessful 
valuable - worthless 
honest - dishonest 
interesting - boring 
pessimistic - optimistic 


Familiarity Factor 
clear - vague 
usual - unusual 
disorderly - orderly 
conservative - progressive 


Activity Factor 


active - passive 
excitable - calm 
inhibited - uninhibited 


strong - weak 
fast - slow 


SCALES USED FOR RATING PURPOSES 


A 7-position ordinal scale was interposed
between each pair of bipolar adjectives forming the
thirty-five semantic scales. The seven positions
on each scale were defined as follows: (1) extremely,
(2) moderately, (3) slightly, (4) neutral, (5)
slightly, (6) moderately, and (7) extremely.
The subject is asked to rate a concept on the 7-point
scale in terms of which of the two bipolar adjectives
is believed to be most appropriate, and in terms of
the four adjective positions adjacent to such word.


Concepts Used 


Three different concepts were used in the stan- 
dardization of The Semantic Differential for Secon- 
dary School Students (S-D): 


What I learned in this class.
The teacher of this class.
Me as a student.


Each one of the three different concepts made use of
the same thirty-five semantic scales described.


STANDARDIZATION 


Six hundred and ten student records were used in
the standardization process. About half of them,
287, were from high school students, while the
remainder, 323, were from upper-division college
students or graduate students.


Item Retention and Revision 


All semantic scales were subjected to an item
analysis, and only those items that correlated .20 or
better with the total score for all three concepts
were retained. Three separate revisions were
necessary before reasonable stability was established,
and where an r of .20 or better was established for
two of the three concepts utilized, i.e., (1) What
I learned in this class, (2) The teacher of this
class, and (3) Me as a student.


Assigning Weights to Semantic Scales 


Each of the thirty-five semantic scales was
assigned values ranging from 1 to 7 for the seven
adjective positions on the interposed ordinal scales.
The initial step in the weighting involved identifying
those adjective pairs where one of the adjectives
seemed clearly to be desired over the other, and the
value of 7 was assigned to the side of the semantic
scale with that adjective, with the 1 being assigned
to the other, i.e., practical, thorough, logical,
appreciative, honest, loyal, and the like. A statistical
technique was then used to determine on which
side the value of 7 was to be assigned on the semantic
scales where it seemed questionable which
adjective of the bipolar pairs was to be desired, i.e.,
original, active, excitable, narrow interest,
conservative, fast, tolerant, etc. (7).
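
The weighting scheme can be illustrated with a short sketch. It is not the scoring key of the published instrument: the function names are hypothetical, and counting positions 1 to 7 from left to right on the printed scale is an assumption made for the example.

```python
SCALE_POINTS = 7


def item_weight(position, keyed_left):
    """Weight for one semantic scale.

    position: mark on the scale, counted 1..7 from left to right
              (an assumed convention).
    keyed_left: True when the left-hand adjective carries the value 7.
    """
    if not 1 <= position <= SCALE_POINTS:
        raise ValueError("position must be between 1 and 7")
    return SCALE_POINTS + 1 - position if keyed_left else position


def part_score(positions, keys):
    """Sum of item weights for one rated concept."""
    if len(positions) != len(keys):
        raise ValueError("one key is needed per scale")
    return sum(item_weight(p, k) for p, k in zip(positions, keys))


# Hypothetical ratings on three scales for one concept.
positions = [1, 7, 4]
keys = [True, False, False]  # first scale keyed left, others keyed right
print(part_score(positions, keys))
```

A mark at the far left of a left-keyed scale and a mark at the far right of a right-keyed scale both earn the full weight of 7, while the neutral position earns 4 under either keying.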


Reliability 


Data contained in Table 1 illustrate the internal consistency type of reliability for each of the three part scores and the total score by use of the traditional K-R Formula 20. The part scores range from an r of .421 for Part II - Learning, to an r of .610 for Part III - Student. Total score reliabilities were computed for three different variations of the K-R 20 Formula, i.e., the Traditional K-R 20 assumes all items have equal difficulty and correlations; the Cronbach Alpha obtains the correlation for all possible splits of the test; while Horst corrects for dispersion of item difficulty. When there is little dispersion of item difficulty, there is little difference among the r's obtained for the three variations of the K-R Formula 20. Since the r obtained for the Horst variation of the K-R Formula 20 is considerably larger than for the traditional K-R 20 and the Cronbach Alpha, it is obvious that there was considerable dispersion of item difficulty in the study.


TABLE 1
INTERNAL CONSISTENCY RELIABILITY (N = 610)

Variation of     Part I     Part II     Part III    Total S-D
K-R 20           Teacher    Learning    Student     Score

Traditional      .534       .421        .610        .928
Cronbach's                                          .929
Horst's                                             .960
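
As an illustration of the kind of internal-consistency coefficient reported in Table 1, the following is a minimal sketch of Cronbach's alpha, which is the general form of K-R Formula 20 for items scored on more than two points. The function name and the small response matrix are hypothetical and are not taken from the study; the Horst correction is not shown.

```python
from statistics import pvariance


def cronbach_alpha(scores):
    """Cronbach's alpha for a respondents-by-items score matrix.

    scores: list of rows, one row per respondent, one column per item.
    """
    k = len(scores[0])  # number of items
    # Variance of each item across respondents.
    item_vars = [pvariance([row[j] for row in scores]) for j in range(k)]
    # Variance of the respondents' total scores.
    total_var = pvariance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Hypothetical data: five respondents, four scales weighted 1 to 7.
responses = [
    [7, 6, 7, 5],
    [4, 4, 5, 3],
    [2, 3, 2, 2],
    [6, 5, 6, 6],
    [3, 3, 4, 2],
]
alpha = cronbach_alpha(responses)
print(round(alpha, 3))
```

With dichotomous (0/1) items the same expression reduces to K-R Formula 20, since the variance of a 0/1 item equals pq.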


Scoring of S- D 


Three part scores were computed based on the Likert technique (3). Each semantic scale (test item) received a weight from 1 to 7, with the value of 7 being assigned to the right side when a single asterisk follows the scale, and to the left side when two asterisks follow the scale, as shown in the appendix. The same thirty-five semantic scales were used for all three concepts, with each concept representing a part score, and with the sum of the three part scores being the total score on the S-D:


Part I - Teacher: (rating for "The teacher of this class"),

Part II - Learning: (rating for "What I learned in this class"),

Part III - Student: (rating for "Me as a student in this class"), and

Total S-D Score: sum of the three separate scores.
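
The scoring rule above can be sketched as follows. This is only an illustration under stated assumptions: positions are numbered 1 to 7 from left to right, the key records which side of each scale carries the weight of 7, and all function and variable names are hypothetical.

```python
def scale_score(position, high_side):
    """Weight (1 to 7) for one marked scale.

    position: marked position, 1 (far left) to 7 (far right).
    high_side: "right" if a single asterisk follows the scale,
               "left" if two asterisks follow it.
    """
    if not 1 <= position <= 7:
        raise ValueError("position must be between 1 and 7")
    return position if high_side == "right" else 8 - position


def part_score(positions, key):
    """Sum of the scale weights for one concept (one part score)."""
    return sum(scale_score(p, side) for p, side in zip(positions, key))


# Hypothetical three-scale key and marks, for illustration only.
key = ["left", "left", "right"]
teacher = part_score([1, 2, 7], key)   # 7 + 6 + 7
learning = part_score([4, 4, 4], key)  # 4 + 4 + 4
student = part_score([7, 7, 1], key)   # 1 + 1 + 1
total = teacher + learning + student
print(teacher, learning, student, total)
```

With the full thirty-five scales, each part score can range from 35 to 245 and the total from 105 to 735.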


TABLE 2
INTERCORRELATIONS OF PRE-COURSE S-D SCORES (N = 243)

Scores on               Part I     Part II     Part III    Total S-D
S-D                     Teacher    Learning    Student     Score

Part I - Teacher                   .538        .584        .876
Part II - Learning                             .530        .807
Part III - Student                                         .908
Total S-D Score

Mean                    191.81     173.76      166.69      532.25
Standard Deviation      24.13      28.96       18.88       63.97


Intercorrelations of S- D Scores 


The intercorrelations of scores on the S-D were computed separately for the pre- and post-course administrations and are shown in Tables 2 and 3, respectively. The means and standard deviations for the pre- and post-course administrations of the S-D are also included. The shift in the correlation coefficients is in the direction of greater common variance between the student and both the teacher and learning, with the greatest shift being toward embracing the values of the teacher, i.e., from an r of .584 to an r of .707. By comparing the means for the pre-course S-D administration in Table 2 with the post-course means in Table 3, it can be seen that the greatest change takes place with student, as opposed to teacher and learning.


Criterion Study 


Two faculty members from the Educational Psychology Department at the University of Wisconsin-Milwaukee, an assistant professor and a full professor, were involved. Two hundred and forty-three students were asked to score the S-D at the beginning of the semester, but only 237 of the same students completed it at the end. An analysis of variance for correlated observations was accomplished to determine whether there was a significant change in the attitude of students as measured by the S-D for the three concepts included. The data illustrating the findings of that test are contained in Table 4. The only statistically significant change indicated in Table 4 is for Part Score III - Student, which deals with the student's own opinion of himself as a student. The change is in the direction of greater esteem for self, with little or no significant change in attitude toward either the teacher or what he feels he learned during the particular course. Based on this finding, students appear to feel that they have changed for the better as a result of the courses, but the basis of that change does not appear to involve a change in attitude toward either the teacher or what they have learned.


Factor Analysis 


Two separate principal components analyses were accomplished involving the 105 semantic scales (items on the S-D) as variables, and the entire 610 subjects involved in the initial standardization process. The data for these two analyses are not included, as they are too voluminous and add little to the findings. The first of the two analyses extracted twelve separate factors with eigenvalues of [illegible] or more, with the first factor accounting for [illegible] percent of the total variance, and all twelve factors accounting for 83 percent. When these twelve factors were rotated to simple structure by the varimax orthogonal method, four factors for each of the three concepts were obviously in agreement with the Osgood factor content of the semantic scales initially included: I - Evaluative, II - Activity, III - Familiarity, and IV - Potency.


The second factor analysis was done on the same data, and in the same manner, except that only three factors were extracted. This was done to determine if the factorial content for the three concepts (teacher, learning, and student) was more potent than the factor identification of the semantic scales. The first of the three factors accounted for 60 percent of the total variance, but all three of the factors only accounted for 70 percent of the variance. Interaction of the semantic scales and concepts seems to follow the pattern described by Nunnally (5) and others, where the loadings for the concepts are more factorially potent than for the semantic scales.


TABLE 3
INTERCORRELATIONS OF POST-COURSE S-D SCORES (N = 237)

Scores on               Part I     Part II     Part III    Total S-D
S-D                     Teacher    Learning    Student     Score

Part I - Teacher                   .687        .707        .843
Part II - Learning                             .620        .812
Part III - Student                                         .878
Total S-D Score

Mean                    194.80     174.84      187.38      557.02
Standard Deviation      21.53      28.09       20.73       62.33
THE SEMANTIC DIFFERENTIAL FOR SECONDARY SCHOOL STUDENTS


This SEMANTIC DIFFERENTIAL is intended for use in assessing the attitude of persons in relation to certain concepts related to learning. It consists of thirty-five Semantic Scales (adjective antonyms) which are related to effectiveness in student learning, both at the secondary and college levels of instruction. The three important concepts which have been used in the preliminary validation of this instrument are: (1) Teacher, (2) Learning, and (3) Student. Any number of other pertinent concepts may be used with the same thirty-five Semantic Scales contained in this instrument.


General Directions: 


Each of the following pages contains a "concept" at the top which is believed to be related to how well you have learned, and with thirty-five different pairs of opposite adjectives, which are called "Semantic Scales." The concept is different for each page, but the thirty-five adjective pairs are the same. You are to mark each of the thirty-five adjective pairs, which we will refer to as "Semantic Scales," in relation to how you actually feel about the concept at the top of the page. The concept at the top of the first page is "WHAT I LEARNED IN THIS CLASS," and we have used this concept in the Example that follows:


EXAMPLE: WHAT I LEARNED IN THIS CLASS

                 EX    MO    SL    NE    SL    MO    EX*

(1) Strange     _X_ : ___ : ___ : ___ : ___ : ___ : ___   Familiar
(2) Ugly        ___ : ___ : ___ : ___ : ___ : ___ : _X_   Beautiful
(3) Easy        ___ : ___ : ___ : _X_ : ___ : ___ : ___   Hard

* EX = Extremely; MO = Moderately; SL = Slightly; NE = Neutral; SL = Slightly; MO = Moderately; EX = Extremely.


If you think that "HOW WELL YOU LEARNED IN THIS CLASS" was strange, make an "X" near the word strange; but if you think it was more familiar, mark the "X" near the word familiar. The example with the "X" right next to strange indicates that the student thought WHAT HE LEARNED IN THIS CLASS was strange. The "X" right next to beautiful suggests that he felt WHAT HE LEARNED IN THIS CLASS was beautiful. For the "Easy-Hard" Semantic Scale (adjective antonyms) the "X" is placed right in the middle of the scale, or about half-way between easy and hard. This is a neutral position indicating that the student felt that WHAT HE LEARNED IN THIS CLASS was neither easy nor hard. It was probably some of each; so he placed the "X" in the middle of the Semantic Scale for that concept.


Remember each page contains a new and dif- 
ferent concept, but the same thirty-five Semantic 
Scales are used. You are to mark each of the thirty- 
five Semantic Scales for all of the concepts included. 
When you are finished, turn the booklet face-down. 


(Appendix is continued on the following page) 
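APPENDIX (Continued from previous page)


WHAT I LEARNED IN THIS CLASS


                      Extre-  Moder-  Sligh-  Neut-  Sligh-  Moder-  Extre-
                      mely    ately   tly     ral    tly     ately   mely

 1. practical         ___ : ___ : ___ : ___ : ___ : ___ : ___   imaginative (marking illegible)
 2. thorough          ___ : ___ : ___ : ___ : ___ : ___ : ___   partial**
 3. logical           ___ : ___ : ___ : ___ : ___ : ___ : ___   illogical**
 4. sympathetic       ___ : ___ : ___ : ___ : ___ : ___ : ___   unsympathetic**
 5. clear             ___ : ___ : ___ : ___ : ___ : ___ : ___   vague**
 6. appreciative      ___ : ___ : ___ : ___ : ___ : ___ : ___   unappreciative**
 7. civilized         ___ : ___ : ___ : ___ : ___ : ___ : ___   uncivilized**
 8. curious           ___ : ___ : ___ : ___ : ___ : ___ : ___   indifferent**
 9. insightful        ___ : ___ : ___ : ___ : ___ : ___ : ___   blind**
10. original          ___ : ___ : ___ : ___ : ___ : ___ : ___   imitational**
11. rational          ___ : ___ : ___ : ___ : ___ : ___ : ___   irrational**
12. sensitive         ___ : ___ : ___ : ___ : ___ : ___ : ___   [illegible]
13. wise              ___ : ___ : ___ : ___ : ___ : ___ : ___   foolish**
14. interesting       ___ : ___ : ___ : ___ : ___ : ___ : ___   [illegible]
15. successful        ___ : ___ : ___ : ___ : ___ : ___ : ___   [illegible]
16. strong            ___ : ___ : ___ : ___ : ___ : ___ : ___   [illegible]
17. active            ___ : ___ : ___ : ___ : ___ : ___ : ___   passive**
18. fast              ___ : ___ : ___ : ___ : ___ : ___ : ___   slow**
19. usual             ___ : ___ : ___ : ___ : ___ : ___ : ___   unusual (marking illegible)
20. valuable          ___ : ___ : ___ : ___ : ___ : ___ : ___   worthless**
21. excitable         ___ : ___ : ___ : ___ : ___ : ___ : ___   calm**
22. honest            ___ : ___ : ___ : ___ : ___ : ___ : ___   dishonest**
23. affectionate      ___ : ___ : ___ : ___ : ___ : ___ : ___   hateful**
24. forgiving         ___ : ___ : ___ : ___ : ___ : ___ : ___   unforgiving**
25. frank             ___ : ___ : ___ : ___ : ___ : ___ : ___   deceitful**
26. loyal             ___ : ___ : ___ : ___ : ___ : ___ : ___   disloyal**
27. tolerant          ___ : ___ : ___ : ___ : ___ : ___ : ___   intolerant**
28. pessimistic       ___ : ___ : ___ : ___ : ___ : ___ : ___   optimistic*
29. lazy              ___ : ___ : ___ : ___ : ___ : ___ : ___   ambitious*
30. narrow interest   ___ : ___ : ___ : ___ : ___ : ___ : ___   broad interest*
31. shallow           ___ : ___ : ___ : ___ : ___ : ___ : ___   deep*
32. simple            ___ : ___ : ___ : ___ : ___ : ___ : ___   complex (marking illegible)
33. conservative      ___ : ___ : ___ : ___ : ___ : ___ : ___   progressive (marking illegible)
34. inhibited         ___ : ___ : ___ : ___ : ___ : ___ : ___   uninhibited (marking illegible)
35. disorderly        ___ : ___ : ___ : ___ : ___ : ___ : ___   orderly (marking illegible)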
EFFECTS OF NEUROLOGICAL TRAINING
ON PSYCHOMOTOR ABILITIES
OF KINDERGARTEN CHILDREN


RICHARD D. CORNISH 
Unified School District Number 1, Racine, Wisconsin 


ABSTRACT 


That neurological dysfunction accounts for many cases of academic failure is well documented, but specific research on programs designed to overcome this dysfunction is sparse. In this study fifty kindergarten children having perceptuomotor and/or psychomotor deficits as identified by several well-known scales were assigned to an experimental or control group. The experimental group was then given a program of cross-patterning exercises for 3 minutes a day over a 3-month period. The obtained results suggest that a neurological training program does not significantly improve the psychomotor functioning of kindergarten children.


THE CAUSES of academic failure (children who benefit to a limited degree from instruction received) are many. Indeed, there may be as many causes as there are academic failures. Research into the etiology of failure has been carried out by educators, psychologists, psychiatrists, pediatricians, and sociologists, to name a few. Causal factors delineated by these researchers by and large fall into their own area of specialization, but basically into five categories: (a) congenital defect or deficit, (b) environmental influence, (c) psychological factors, (d) physiological factors, or (e) various combinations of the preceding four (16).

In short, [illegible] delineated a single causal factor of [illegible]. [Illegible] (16), while recognizing that etiological factors must eventually be treated, holds that they are so diverse, and methods [illegible] unsatisfactory, [illegible] symptomatic behavior [illegible] remedial reading programs, etc.

Problem children typically cannot be identified [illegible] long after they should have started to read - the end of the second grade at the earliest. This forces all [illegible] to be [illegible] could be predicted.

This is no small problem. Estimates of the numbers of children whose academic achievement does not correspond with intellectual potential run as high as 30 percent of the school district population (4). Although McLeod (16) holds that etiological factors are too varied or evasive to be of concern, he does recognize that neurological dysfunction accounts for many causes. He states, "A study of the behavior of the undeveloped learner will reveal the majority of the children experiencing [illegible] their basic learning skills also have [illegible] organizing their visual, auditory, and motor experiences" (16:27).

For reasons of parsimony, all types of academic failure will not be discussed here. In the ensuing discussion, reading [illegible] failure. [Illegible] to learning to read successfully, the same factors found in general disability (3, 9, 10, 19, 21).

That reading failure can be predicted has been shown by DeHirsch, Jansky, and Langford (5). In [illegible] subsequently poor readers, [illegible] as kindergartners [illegible] a crude body image, and primitive visuomotor
experiences, all testifying to ... a generalized developmental dysfunction" (5:xiii). A battery of tests was developed to reflect perceptuomotor and linguistic ability at the kindergarten level. This battery subsequently proved to have predictive validity for reading difficulties.


That reading difficulty can be predicted is not a new idea. It had been advocated by Delacato (6) as early as 1959. Delacato's work, however, contained no empirical verification of the prediction of reading problems. He held, through clinical judgment, that reading problems could be prevented. Delacato (6) also presented a method for this prevention, a method that has become synonymous with his name. The method is long and involved, encompassing such things as handedness, eye preference, thumb sucking, posturalization, etc. The Delacato theory raised much controversy and, because of its complexity, generated very little empirical research.


In 1963 Delacato (7) redefined his position and more clearly delineated his treatment and prevention procedures. He claims his new work was the product of much research, but fails to report this research as research. The reader often cannot separate research findings from clinical intuition. In the 1963 work the major technique in the treatment and prevention schema is that of cross-patterning, a technique that has received subsequent research attention and the one to which the present study directs itself.


The present study grew out of the author's work with the Oconomowoc, Wisconsin, Public Schools' Learning Center. During the summer of 1968, 372 children were enrolled in the Center's Pre-Kindergarten Clinic. Of this number 107, or just over 29 percent, were found to be retarded or deficient in basic perceptuomotor skills. During the following year, while the children were in kindergarten, a cross-patterning program was instituted to see if the perceptuomotor difficulties could be overcome. The specific hypothesis tested in this study is that children with perceptuomotor difficulties who are given cross-patterning exercises will show a greater improvement in perceptuomotor coordination than children with the same difficulties who are not given the exercises. It is further hypothesized that the above differences will be significant at the .05 level.


RELATED RESEARCH REVIEW 


There has been very little research done in this area. Most of what is available is found reported by Delacato (8) in "... ten carefully controlled experiments." However, these studies have many weaknesses.


Sister Mariam (12) compared reading disability [illegible] neurologically organized and neurologically disorganized children in a descriptive study. No attempt was made to improve neurological organization or reading ability. The study showed there is a significant difference in reading ability between neurologically organized and neurologically disorganized children. The neurologically disorganized children [illegible] to score significantly lower in the areas of comprehension, visual and auditory recognition, and oral reading performance.


Masterman (13) dealt with neurological training exercises and their effect on reading retardation. One hundred forty-one pairs of children were matched according to sex, [illegible] grade placement. One member of each pair [illegible] in the experimental group and received various neurological training exercises, including cross-patterning. The [illegible] received no treatment. Mean gain scores on the Gray Oral Reading Test were compared and the [illegible] that both groups came from the same population was found to be "near the .01 level."


The Masterman study contains [illegible] weaknesses. Firstly, Masterman stated that any differences found would be in favor of the experimental group and hence used a one-tailed test. Although this is a [illegible] assumption, the differences may not [illegible] be in one direction and, as such, a two-tailed test should have been used.


Secondly, the treatment period lasted only [illegible]. In that the scores on the Gray Test are based on [illegible] month year, Masterman multiplied the pre-post difference scores by 10 for the purpose of [illegible]. This is based upon the assumption that learning curves are increasing linear functions.


McGrath (15) found that ninety-two students, ranging from grade 7 to grade 11, enrolled in a [illegible] remedial reading program, read at fifth to ninth grade levels. All ninety-two students were given the Metropolitan Reading Test, form Am, as a pretest. The students were then given a remedial reading program consisting of the Science Research Associates Reading Kit, the Reader's Digest Practice Reader, and Catholic Charities Spelling Charts, in addition to a neurological training program, including cross-patterning. All students were given five 5-day weeks of this program and then were retested on the Metropolitan Reading Test, form Bm. McGrath reported the results verbally; however, Delacato supplied a [illegible] mean difference and found significant improvement at the .01 level. However, without appropriate control groups, the obtained difference cannot be attributed to any of the treatments.


Kabot (11) studied experimental and control groups consisting of eleven matched pairs of third-grade children who, on the Stanford Reading Achievement Test, scored 6 months or more lower than their grade level. Following 8 weeks of neurological training, the two groups were given a posttest, using the California Reading Test. The results obtained when comparing the improvement of the experimental and control groups were not significant.


Correlations between the Stanford Test and California Test are not reported. In this case, the training program may have been effective and this difference lost due to the use of two completely different tests.


To assess any possible transfer of training, [illegible] of the experimental and control groups were tested on an alternate form of the California Test, after receiving a 7-month remedial reading program. The resulting difference between the experimental and control groups was taken as evidence that neurological training transferred and resulted in significantly improved performance in a remedial reading program.


Alcuin (1) divided 120 Ss (method not given) into three groups: the Reading, Psychological, and Neurological Groups. [Illegible] respective treatment three times a day, twenty times [illegible] did calisthenics, "so planned as not to involve the whole organism in such a way as to improve neuro[illegible]." All groups were given the Stanford Reading Test, both before and after [illegible] the two latter mentioned groups did not differ significantly from each other.

Although this area is starting to receive attention in the professional journals (14, 17), research material other than that reported by Delacato (8) is yet sparse. A study by Painter (18) was designed to "investigate the effects of a rhythmic and sensory [illegible] motor integration, and psychological competence of kindergarten children."


The Ss for the experiment were the lower 50 percent of a forty-student kindergarten class. Relative standing in class was indicated by Goodenough MA scores. The twenty Ss were then divided into matched pairs on the basis of chronological age (CA), mental age (MA), and sex.

The experimental group was given training sessions three times a week over a 7-week period. No time was spent with the control group to compensate for the Hawthorne effect. The treatment method used was not that advocated by Delacato (6, 7), but one patterned after that suggested by Barsch (2). Many of these exercises bear a strong functional relationship to those of Delacato, including jumping, hopping, skipping, and bilateral body movement.


In this study the author did not report results but merely listed the following hypotheses and reported [illegible] sign test; the program will improve sensory motor spatial performance skills: p = .002, sign test; the procedure will improve psycholinguistic abilities: p = .055, sign test.


Noting that the Delacato method is so long and involved that controlled experiments are difficult and, when done, cannot be replicated, Silver, Hagen, and Hersh (20) attack the problem of reading disability through the direct stimulation of the deficit perceptual areas. Ss for the experiment were eighty males from 7 to 11 years old, all of whom had been referred to a mental hygiene clinic for learning and behavior problems. The Ss were paired on the basis of age, IQ, and neurological and psychiatric examination, then randomly assigned to one of the two experimental groups.


The training sessions were individual, each lasting 45 minutes and held twice weekly over a period of [illegible] a tutor. For the second 6 months the situation was re[illegible] experiment, at the end of 6 months, and at the end of the experiment. Thus, in addition to the group contrasts, each S acts as his own control. The training techniques are not delineated, but were stated to cover the visual, auditory, tactile, and kinesthetic modalities.


The experiment was not completed at the time of publication, so no quantitative data were presented. Two case studies were presented and the preliminary results "showed definite progress."


METHOD 
Subjects 


[Illegible] ocular pursuit, identification of body parts, imitation of movement), the Draw A Man and Ten Dot subtests of the Anton Brenner Developmental Gestalt Test of School Readiness, and lack of visual fusion as measured [illegible] randomly assigned to treatment. E contained twenty-three Ss, nine males and fourteen females. C contained twenty-seven Ss, sixteen males and eleven females.


Ss were both pre- and posttested on the Purdue Perceptual Motor Survey, the Keystone Visual Survey, and the Draw A Man and Ten Dot subtests of the Anton Brenner Developmental Gestalt Test of School Readiness.


Apparatus

The apparatus used in this experiment was the Exer-Cor, a mechanical device that insures proper synchronization of cross-patterning movements. The apparatus is manufactured by Flick-Reedy Education Enterprises of Bensonville, Illinois. It is a device 48 inches long and 14 inches wide. It has hand and knee pads that ride on rollers and are moved by muscular effort. These pads are interconnected by a system of cables and pulleys that literally force cross-patterning.


Treatment

Ss were pretested in July 1968, when they were enrolled in a pre-kindergarten clinic. Ss in the E group then trained on the Exer-Cor 3 minutes per day for a period of 3 months with their regular classroom teacher. Ss in the C group were not given any compensatory attention to control for the Hawthorne effect. In addition to the cross-patterning, the Ss were instructed to turn their heads in the direction of, and focus their eyes upon, the hand which was in the forward position. In addition to verbal instructions, Ss were given a demonstration of what was expected of them in the training exercises. At the end of the 3-month training period, they were posttested.


Tested Hypotheses 


The specific statistical (null) hypotheses tested are: (1) There is no difference between E and C groups in perceptuomotor skills as measured by the Purdue Perceptual Motor Survey. (2) There is no difference between [illegible] the Draw A Man and Ten [illegible] Developmental [illegible]. (3) There is no difference between E and C groups in visual fusion as measured by the Keystone Visual Survey. The .05 level of significance was used for all hypotheses.


RESULTS 


Hypothesis 1

A t-test for independent groups was performed, using difference (posttest minus pretest) scores for all subtests (hypothesis stated above).


Walking [illegible]. The results showed that the control group was performing at a higher level than the experimental group. Results were not significant (t (48) = 5, p > .05).

Imitation of Movement Subtest. The experimental group scored higher than the control; results were not significant (t (48) = .195, p > .05).


Ocular Pursuit. The control group scored significantly higher than the experimental group (t (48) = -2.097, p < .05).


Identification of Body Parts. The control group scored higher than the experimental group; results were not significant (t (48) = -1.49, p > .05).
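
The comparison described under Hypothesis 1 can be sketched as a pooled-variance t-test on gain (posttest minus pretest) scores. This is a minimal illustration, assuming the usual equal-variance form of the test; the function name and the scores are hypothetical and are not the study's data. With 23 Ss in E and 27 in C, this form of the test has 23 + 27 - 2 = 48 degrees of freedom, matching the values reported above.

```python
from math import sqrt
from statistics import mean, variance


def pooled_t(e_gains, c_gains):
    """Independent-groups t statistic (E minus C) and its degrees of freedom."""
    n1, n2 = len(e_gains), len(c_gains)
    df = n1 + n2 - 2
    # Pooled estimate of the common variance of the gain scores.
    sp2 = ((n1 - 1) * variance(e_gains) + (n2 - 1) * variance(c_gains)) / df
    t = (mean(e_gains) - mean(c_gains)) / sqrt(sp2 * (1 / n1 + 1 / n2))
    return t, df


# Hypothetical pretest and posttest scores for a few Ss in each group.
e_pre, e_post = [3, 2, 4, 3], [4, 4, 5, 3]
c_pre, c_post = [3, 3, 2, 4], [5, 4, 4, 5]

e_gains = [post - pre for pre, post in zip(e_pre, e_post)]
c_gains = [post - pre for pre, post in zip(c_pre, c_post)]

t, df = pooled_t(e_gains, c_gains)
print(round(t, 3), df)  # a negative t means the control group gained more
```

The resulting t would then be compared with the two-tailed critical value for the given degrees of freedom at the .05 level.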


Hypothesis 2


A t-test for independent data was performed using difference scores in both cases (the hypothesis [illegible] the above section).


Draw A Man. The control group scored higher than the experimental group; results were not significant (t (48) = .510 [illegible]).


Ten Dot. The experimental group scored higher than the control group, but results were not significant (t (48) = 0, p > .05).


Hypothesis 3

There is no difference between E and C groups in visual fusion as measured by the Keystone Visual Survey. A chi-square test was performed on the dif[illegible] pre [illegible] exhibiting fusion problems. The results were not significant (χ² (48) = .07, p > .05).
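
A chi-square test of this kind can be sketched for a 2 x 2 table of group (E, C) by outcome (fusion problem, no fusion problem). The layout of the table, the function name, and the counts below are assumptions made for illustration only; the study's actual frequencies are not reported. The statistic is computed without a continuity correction.

```python
def chi_square_2x2(table):
    """Pearson chi-square statistic for a 2 x 2 table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [table[0][j] + table[1][j] for j in range(2)]
    n = sum(row_totals)
    stat = 0.0
    for i in range(2):
        for j in range(2):
            # Expected count under independence of group and outcome.
            expected = row_totals[i] * col_totals[j] / n
            stat += (table[i][j] - expected) ** 2 / expected
    return stat


# Hypothetical counts: rows are E (23 Ss) and C (27 Ss);
# columns are "fusion problem" and "no fusion problem".
observed = [[5, 18], [6, 21]]
print(round(chi_square_2x2(observed), 3))
```

For a 2 x 2 table the statistic is referred to the chi-square distribution with 1 degree of freedom, whose critical value at the .05 level is 3.84.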
CONCLUSIONS AND DISCUSSION

That neurological training exercises, specifically cross-patterning exercises, will improve perceptuo[illegible] as herein defined has not been shown by [illegible] six measures of perceptuomotor skills, there were only three instances where the experimental group improved more than the control, and one case where the control group scored significantly [illegible] than the experimental [illegible] evidence against Delacato's theory of neurological [illegible]. However, in the present [illegible] other possible explanations.


One weakness of the study was the use of intact [illegible]. A much better design would have been to select Ss for E and C groups from within the same class; this would con[illegible] differences. Unknown to E, the te[illegible] to the control group was conscious [illegible] early childhood psychomotor tr[illegible]. It is possible that she [illegible] especially since the experimental group only received [illegible] minutes of training daily. It would perhaps have been beneficial to start this [illegible] increase it. Or, perhaps even better, to have s[illegible] treatment groups, each on a different time schedule.


Another possibility is that neurological training does not increase perceptuomotor skills. This is a source of controversy, with the learned men arguing on both sides of the fence and each side backing their own position with their own research. The present study, due to its inherent weaknesses, does not offer evidence for either position. Carefully controlled research is necessary.


Future studies should be of truly random design and preferably a Solomon 4-group Design. This would eliminate the intact classroom problems and also enhance external validity. [Illegible] should, if at all possible, be administered by a trained [illegible]; care should be taken to prevent anything that can be adjudged to be a form of the same treatment [illegible] treatment.
ABSTRACT


The purpose of this study was to determine what effect the communication of precise instructional objectives to students has on their learning. The study was designed (1) to provide data on whether student achievement can be influenced significantly by providing students, in advance of instruction, information on what is expected of them as an outcome of instruction and (2) to investigate various ways of communicating to students, in writing, that which is to be learned in class. The Ss for this study were selected from five tenth-grade health and safety classes taught by the same teacher. Of the 143 Ss, one third in each class was randomly assigned to one of three treatment groups. For treatment groups one through three, the participants received precisely stated instructional objectives, vaguely stated instructional objectives, and short paragraphs of health information, respectively. Ss receiving prior to instruction precise information on what is expected of them showed greater achievement than those who received vague or related information.


DESPITE emerging pronouncements as to the value and utility of instructional objectives to the [illegible] situation, many teachers and curriculum workers still look upon objectives as necessary decorations to satisfy the curriculum theorist [illegible] objectives [illegible] best be utilized in the teaching-learning setting so as to favorably influence student achievement. [Illegible]mately, [illegible] value of objectives to teachers will be the degree to which these statements serve a useful purpose in the teaching-learning process.

A study was designed to determine the effect communication of precise instructional objectives to students has on their learning. The study was conducted [illegible] on whether student achievement [illegible] significantly by providing students, [illegible] of instruction, information on what is expected [illegible] outcome of instruction and (2) [illegible].

Although there have been no previous studies showing the [illegible] providing learners with instructional objectives prior to instruction in public [illegible] settings: (1) the need for such studies, while not expressed by authorities in the literature, seems to exist in that some in the education profession allude to the usefulness of learners having knowledge of classroom instructional objectives (8, [illegible], 15:91, 17, 20, 23); (2) studies in other settings have indicated a usefulness in providing learners [illegible] instructional objectives in advance of instruction ([illegible] 13:28, 20). For instance, it has been found that knowledge of objectives by adults reduces the time [illegible] educating them in tasks related to their jobs, and objectives possibly increase achievement by college students; (3) studies that assess the effect of using behavioral instructional objectives with teachers revealed enhanced achievement by the learners taught by these teachers (21, 24); and (4) other related studies where instructions and advance organizers were used with learners prior to instruction [illegible] found that their achievement was influenced favorably (1:235, 2, 3, 4, 5, 9, 11:268, 19:640-641, 29:119).


Authorities have suggested that giving learners objectives will help them know what is expected of them in advance of instruction on the basis that [illegible] a teaching strategy: (1) helps the learner identify the required terminal performance (12:357-358); (2) assists the learner in maintaining his own control of learning task reinforcement (13:26-27); (3) provides knowledge of goals to attain, which in turn is both instructive and motivating (14); (4) facilitates the exploration of alternatives for [illegible]; (5) [illegible] greater commitment by [illegible]; (6) helps learners discriminate between relevant [illegible] learning [illegible] (1:85-86, 2:270-271); (7) predisposes [illegible] certain [illegible] of behavior (11:269); (8) engages [illegible] prior [illegible] (28); (9) facilitates the learner's organization of relevant knowledge which will direct his thinking to the learning task (10:[illegible]215); (10) provides a motive [illegible] specific rather than random learner behavior [illegible] results in [illegible] learner effort [illegible] readiness to [illegible] (1:228 and 235). [Illegible] is much [illegible] concerning the value of [illegible] learners objectives prior to instruction, [illegible] studies supporting [illegible] approach and none conducted in the public schools.


METHOD 
Subjects 


A teacher wi 
pre “Cher with fi 
poko шашу ШШК Нейл gsr 
the foes Selected as the Шын class high 
Within the five classes there were a total of 143 tenth-grade
students. At the beginning of the study each of the Ss was
randomly assigned to one of three treatment groups
within each of the five classes. The assignment to
treatment groups was conducted by using a table of
random numbers.
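The assignment step described above (random assignment to three treatment groups within each of five classes) can be sketched as follows. This is only an illustration: the authors used a printed table of random numbers, whereas the sketch substitutes a seeded pseudo-random generator, and the roster, class labels, and function name are hypothetical.

```python
import random


def assign_within_classes(roster, n_groups=3, seed=0):
    """Randomly assign students to treatment groups within each class.

    roster maps a class label to a list of student ids (hypothetical data).
    Shuffling inside each class keeps group sizes balanced per class.
    """
    rng = random.Random(seed)  # stands in for a table of random numbers
    assignment = {}
    for class_label, students in roster.items():
        shuffled = list(students)
        rng.shuffle(shuffled)
        for position, student in enumerate(shuffled):
            # Deal students out to groups 1..n_groups in shuffled order.
            assignment[student] = position % n_groups + 1
    return assignment


# Hypothetical roster: five classes of 28 students each.
roster = {f"class_{c}": [f"c{c}_s{i}" for i in range(28)] for c in range(1, 6)}
groups = assign_within_classes(roster)
```

Dealing students out after a shuffle is one simple way to keep the three groups nearly equal in size inside every class; the article does not say whether the original procedure balanced group sizes.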
Materials
As part of the study the teacher was asked to conduct
a 3-week unit on growth and development within
the health education program. None of the content
of the unit was presented in the high school course
of study prior to the time of this study. The unit was
developed in accordance with the School Health Education
Study (SHES) (27:42-45) concept "growing and
developing follows a predictable sequence, yet
is unique for each individual." The first four of the
objectives identified by the SHES at the high
school level for this concept served as the framework
for the development of the teaching unit. The unit
was designed as a comprehensive teaching "package"
for teacher use and included an identification
of content related to each unit objective and a variety
of learning opportunities keyed to each unit objective.
For each unit objective a number of related precise
instructional objectives for student use were developed.
The precise instructional objectives made explicit
specific content, the kind of overt behavior expected of
the learner with respect to content, conditions to be
imposed upon the learner when he is demonstrating
mastery of the objectives, and the inclusion of what will
be acceptable performance. The vague instructional
objectives were somewhat similar to the precise
instructional objectives except that both the objective
content and behavior
lectives dimensions were genera - the vague 0b- 
the content was presented in ргоа не еп 


tives. 


statement of 


е 
the Ы 


Conditi 
iti 
lons to be imposed upon 


se objec 
tain а 


DALIS 
21 
demonstrating his attainment j 
e g hi: t of the objective. 
was there an indication of what would be НЫ 


learner performance. 


A total of sixteen precise and vague instructional
objectives was developed from the four unit
objectives. Both the precise and vague instructional
objectives represent a sample of a population of such
objectives that could have been developed and that
were implicit within the four unit objectives. In addition,
sixteen separate short paragraphs of written
health information unrelated to classroom learnings
were developed to serve as a placebo with the control
group of students. Each of the paragraphs of written
health information, precise instructional objectives,
and vague instructional objectives was placed on a
separate sheet of paper. These sheets of paper were
referred to as "messages" whenever they were discussed
with Ss.
For each precise instructional objective one multiple
choice test item was developed to assess the
student's understanding of the objective. The same
test items were used with the respective vague instructional
objectives since each item was but a sample
of many possible test items that could have been
prepared for these more general objectives. Therefore,
each "message" containing either a precise or
vague instructional objective included a relevant multiple
choice test item. The instructions on each
"message" sheet directed the S to select from an array
of four choices the one that best went with their
objective. For the written information used as the
placebo a time-consuming activity comparable to the
test items for the precise and vague instructional objectives
was developed to accompany this information.

A sixty-eight item criterion test was developed to
assess student achievement at the conclusion of the
growth and development unit. Also, an opinionnaire
was designed to secure certain reactions by participants
during the study.


Procedures

In this study the Posttest-Only Control Group Design
(1:195-197) was employed. A comparison of
achievement was conducted among those Ss provided
with precise instructional objectives (group one),
those provided with a set of vague instructional
objectives (group two), and those provided with a
placebo (group three) in advance of instruction.
At the beginning of the study Ss were told that they
would participate in an experiment to determine whether
messages to be given them during the period would be
of any assistance. They were informed that different
people would be receiving different messages and that
in order for the experiment to work it was necessary
to maintain absolute secrecy. During the introductory
discussion the topic of "experiments" was discussed,
including the need for maintaining certain controls in
order for "experiments to work." This discussion along
with the announcement that the Ss would participate
in an experiment whereby different people would have
different information was pursued with the intent that
such an approach of honesty would enhance student
compliance with the experiment procedures. Initially,
Ss were told that their grade would not be affected by
their different messages and that they could withdraw
from the experiment at any time without penalty to
their class standing. None of the students elected to
withdraw from the study. During the 3-week unit,
the teacher would pause at points indicated in the
teaching unit plan and provide each student with his
appropriate message. The messages were prepared
in advance by the investigator with the student's name
on each folded message sheet. Throughout the experiment
the teacher remained unaware of the specific
character of the information being given the Ss.


On the last day allocated for the experiment the
criterion test was administered to evaluate achievement
of material contained in the growth and development
unit. Also, each S responded to an "anonymous"
opinionnaire coded in such a fashion that
it was possible through a classroom seating chart to
identify each respondent. One opinionnaire question
concerned the amount of study time spent outside of
class each day during the study of growth and development.
A second concerned the gaining of any information
about messages given to other students.
The test data from six Ss who indicated they had
gained information about messages from someone
else were not included in the analysis. These six
Ss were evenly distributed among the three treatment
groups. In addition, four other Ss were not included
in the study. Three dropped out of school prior to
the conclusion of the study, and one was a native
speaker of Spanish who spoke and read very little
English.


RESULTS 


Contained in Table 1 are data on the analysis of
variance of the criterion test dependent variable between
treatment groups. The very high F ratio,
10.809, clearly shows that there was a treatment effect
favoring group one, the group presented with
precise instructional objectives prior to instruction.
Therefore, the hypothesis indicating that the group
presented with precise instructional objectives prior
to instruction will demonstrate greater achievement
than will the group presented with vague instructional
objectives is accepted beyond the .99 level of confidence.

The means on the criterion test scores of treatment
groups one through three were 40.4, 31.4, and 32.1,
respectively. The mean of 40.4 by the group presented with
precise instructional objectives is significantly separated
from the mean of 31.4 by the group presented with
vague instructional objectives and the mean of 32.1 by
the group presented with the placebo. Thus, the hypothesis
that indicated that the group presented with either
precise or vague instructional objectives prior to instruction
will demonstrate greater achievement than
will the group presented with a placebo can be assumed
to be rejected. Major significant differences are due
to the high level of achievement of group one on the
criterion test.


TABLE 1

ANALYSIS OF VARIANCE OF CRITERION TEST DEPENDENT
VARIABLE AMONG TREATMENT GROUPS

Source     Sum of Squares    df      MS          F
Between       2106.170        2    1053.085    10.978*
Within       12470.151      130      95.924
Total        14576.321      132

*p < .01
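The arithmetic behind the F ratio in Table 1 can be checked with a short sketch. It assumes the table entries read as a between-groups sum of squares of 2106.170 on 2 df and a within-groups sum of squares of 12470.151 on 130 df; the function name is illustrative. Under that reading the ratio of mean squares is about 10.98, which matches the table entry, while the running text quotes 10.809.

```python
def one_way_f(ss_between, df_between, ss_within, df_within):
    """Return (MS between, MS within, F) for a one-way analysis of variance."""
    ms_between = ss_between / df_between
    ms_within = ss_within / df_within
    return ms_between, ms_within, ms_between / ms_within


# Values as read from Table 1 (an assumption about the garbled scan).
ms_b, ms_w, f_ratio = one_way_f(2106.170, 2, 12470.151, 130)
print(f"MS between = {ms_b:.3f}, MS within = {ms_w:.3f}, F = {f_ratio:.3f}")
```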


Group one obtained a mean of 8.9 on the instructional
objective understanding test. The standard
deviation for this group was 4.6. The vague instructional
objective group (group two) obtained a mean
of 2.2 with a standard deviation of 1.6 on the instructional
objective test. The mean difference between
groups one and two was 6.7. When computed, the
value for t was equal to 67.25, significant at the .999
level of confidence. Since there was an extremely
high t value, which shows the significance of differences
between groups one and two, there is probably
a real difference between these groups on instructional
objective understanding. Due to the marked
differences in standard deviation between the groups,
caution is warranted on accepting the extremely high
t value. The hypothesis can thus be accepted that
the precise instructional objective group will be more
able to select activities that go with their objectives
than will the vague instructional objective group.
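Because the two standard deviations differ so much, one common way to compare such means is a t statistic that does not pool the variances (Welch's form). The sketch below shows that computation from the reported means and standard deviations. It is not necessarily the procedure the authors used, and the group sizes are an assumption: the text does not report them, so 44 Ss per group is a hypothetical split. The result depends directly on that assumption and need not reproduce the reported value.

```python
import math


def welch_t(mean_1, sd_1, n_1, mean_2, sd_2, n_2):
    """t statistic for two independent means without pooling variances."""
    standard_error = math.sqrt(sd_1 ** 2 / n_1 + sd_2 ** 2 / n_2)
    return (mean_1 - mean_2) / standard_error


# Means and SDs are from the text; n = 44 per group is hypothetical.
t_value = welch_t(8.9, 4.6, 44, 2.2, 1.6, 44)
print(f"t = {t_value:.2f}")
```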


By using the Kuder-Richardson formulas 20 and
21, it was found that the reliability for groups one
and two was .90 on the instructional objective understanding
test. This very high value indicates that
the test as a whole, regardless of the treatment
group, is particularly reliable and internally consistent.
A .90 reliability coefficient is quite large
for a 16-item test.
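For readers unfamiliar with the Kuder-Richardson formula 20, a minimal sketch follows. It assumes items scored 1 (correct) or 0 (incorrect) and uses population variances throughout; the response matrix is invented for illustration and is not the study's data.

```python
from statistics import pvariance


def kr20(responses):
    """Kuder-Richardson formula 20 for a matrix of 0/1 item scores.

    responses is a list of rows, one row of item scores per examinee.
    """
    n_people = len(responses)
    n_items = len(responses[0])
    total_variance = pvariance([sum(row) for row in responses])
    # Sum of p*q over items, where p is the proportion answering correctly.
    pq_sum = 0.0
    for j in range(n_items):
        p = sum(row[j] for row in responses) / n_people
        pq_sum += p * (1 - p)
    return (n_items / (n_items - 1)) * (1 - pq_sum / total_variance)


# Invented example: five examinees, four items.
example = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]
reliability = kr20(example)
```

Formula 21 is a simplification that replaces the item-by-item sum with a term based on the mean total score, so it needs only the test mean, variance, and number of items.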


By way of description, less overall average
time was spent in studying outside of class by group
one than by groups two and three. Group one actually
spent an average of 17.4 minutes, group two ^. 

minutes, and group three 20.4 minutes Studying daily 
outside of class. This average amount of time spent
studying among the three groups, however, was not
significantly different at the .99 level of confidence.


CONCLUSIONS


According to the findings in this study it was possible
to enhance health education classroom achievement
by using precise instructional objectives in advance
of instruction with high-school-age learners.
These objectives, however, must be precisely stated;
otherwise their value to learning efficiency is doubtful.
In fact, instructional objectives that are vaguely
stated and are general both in content and behavior
may deter learner achievement when given to him
prior to instruction.

Evidence from this study supports the idea that
individuals with precise instructional objectives are
quite able to select activities related to these objectives,
whereas those individuals guided by vague
instructional objectives seemingly became confused
and were unable to select activities that related to
their objectives. Apparently the vague objectives
did not provide the necessary direction and information
needed to facilitate the matching of relevant activities
to instructional objectives.

The study findings revealed that the precision in
stating instructional objectives did not affect, in one
way or another, the amount of time spent studying
daily outside of class by those learners being guided
by these objectives.


SOME EVIDENCE CONCERNING THE VALIDITY
OF AN ELEMENTARY SCHOOL FORM
OF THE DOGMATISM SCALE


DONALD W. FELKER 
DONALD J. TREFFINGER 
Purdue University 


ABSTRACT 


Five hypotheses concerning the validity of Figert’s elementary school form of Rokeach’s dogmatism scale
were investigated. Pupils (N = 120) from fourth, fifth, and sixth grade classes participated in the study. The
results offered little support for the validity of the Figert test. Only one of the five hypotheses was supported,
and that only partially. It seems necessary, therefore, to conclude that scores on this test should not be
interpreted as an assessment of dogmatism unless other evidence can be obtained to provide support for the
validity, or an acceptable alternative interpretation of these data can be formulated.


THE PURPOSE of this study was to investigate
the validity of an elementary school form of a
dogmatism scale which Figert (4) presented as an
adaptation of Rokeach’s (11) Dogmatism Scale. Figert
concluded that his scale was

    functioning relatively effectively as a measuring
    device and was measuring some of the same
    facets of openmindedness-closedmindedness
    among children that adult forms of the instrument
    measure among adults (4:20-21).

Some of these data, he argued, could be interpreted
as evidence for the validity of the instrument. Such
data were:

(1) Mean scores for pupils in grade 4 were significantly
greater than means for pupils in grades 5
and 6;

(2) There was a tendency toward an inverse relationship
between test scores and socioeconomic
class (SES) indices, although means did not differ
significantly among SES levels;

(3) Means for parochial school pupils were significantly
greater than means for pupils in two (of
four) public schools studied.

Figert (4:21) also suggested that the scale should
be used for some scale-validation studies following
the techniques developed by Rokeach and others.

The present study is a validation study.
In addition to examining in a second sample the
differences among grade levels in scores on Figert’s
test, the present study tested several other hypotheses
concerning dogmatism among elementary school
pupils. These hypotheses were derived from relationships
previously established with older Ss.

Rokeach (11) presented evidence, for example,
to support the prediction of a negative relationship
between self-concept and dogmatism. Holden [citation illegible]
reported that external locus of control or responsibility
was positively correlated with scores on the
California F-Scale. Many descriptions of the correlates
of creativity (1, 5) have suggested that highly
creative persons are tolerant of ambiguity, open
to experience, confident, self-assertive, and independent
in judgment. Such a description is similar
in many ways to Rokeach’s (11) description of the
openminded individual. Rokeach (12) argued that
openmindedness may be a prerequisite for creativity;
evidence suggesting a negative relationship between
dogmatism and creativity was presented by
Jacoby (8). Mouw (9) also presented evidence to
support the prediction of a negative relationship between
dogmatism and complex cognitive [illegible].

Thus, the following hypotheses were formulated,
which, on the basis of the previous research
cited, should provide evidence concerning the
construct validity of Figert’s test:

If the Elementary School Form of the Dogmatism
Scale (4) measures dogmatism, then:


(1) There will be differences in mean scores 
among several grade levels; 


(2) There will be a negative relationship between 
dogmatism test scores and scores on the Piers- 
Harris self-concept scale (10); 


(3) There will be a negative relationship between 
dogmatism test scores and assessment of internal 
locus of control, as measured by the Intellectual 
Achievement Responsibility Questionnaire (3); 


(4) There will be a negative relationship between
scores on the dogmatism test and scores on mea- 
sures of creative problem solving abilities, as- 
sessed using a battery developed by Treffinger 
and Ripple (13, 14); 


(5) There will be a negative relationship between
dogmatism test scores and attitudes about creative
thinking, and between scores on the dogmatism
test and self-concept of creative problem
solving ability, as measured by the Childhood
Attitude Inventory for Problem Solving (2).


METHOD 


Sample 


Fourth, fifth, and sixth grade classes from public
schools in northern Indiana (N = 120) participated
in this study.


Instruments 


In addition to Figert's Elementary School Form
of the Dogmatism Scale, all pupils were given several
other instruments. These were:

(a) The Intellectual Achievement Responsibility
(IAR) Questionnaire, which was developed by Crandall,
Katkovsky, and Crandall (3); (b) The Piers-Harris
Self-Concept Scale (P-H), developed by Piers
and Harris (10); (c) A General Problem Solving
battery (GPS), developed by Treffinger and Ripple
(13, 14); and (d) The Childhood Attitude Inventory
for Problem Solving (CAIPS), developed by Covington
(2).


The reliability and validity of these instruments
have been discussed in the references indicated; it was
concluded that, for the purposes of this study, sufficient
evidence was available to warrant their utilization.
All tests were administered by classroom
teachers, using standardized directions, and scored
by trained graduate students.


Analyses

Intercorrelations among all variables were computed
for the entire sample, as well as separate matrices
for boys and girls. Then, grade level differences
in scores on Figert’s dogmatism test were
examined, using one-way analysis of variance to
compare means among fourth, fifth, and sixth
graders. Next, extreme groups on the dogmatism
test (40 highest scoring pupils and 40 lowest) were
identified. These groups were compared on P-H,
CAIPS, and IAR scores. Finally, for a sample of
forty-three pupils on whom identifiable problem solving
data were available, correlations were computed
between GPS scores and dogmatism scores. The
alpha level was set at .05 for all ANOVA’s and tests
of the significance of correlation coefficients.


RESULTS 


Hypothesis one was tested using one-way analysis 
of variance of dogmatism scores among grade levels. 
The means for grade 4 (104.37), grade 5 (105.00), 
and grade 6 (106.30) did not differ significantly 
(F < 1, with 2,119 df). It was necessary, therefore,
to conclude that hypothesis one was not supported.


Correlation coefficients, to test hypotheses two
and five, are presented in Table 1. Among the eighteen
coefficients, only one was significantly different
from zero: the correlation between dogmatism scores
and pupils’ attitudes about creative thinking and problem
solving (CAIPS, I) was -.271. This correlation
was in the direction predicted by our hypothesis: pupils
with higher dogmatism scores tended also to have
lower scores on the attitude measure.
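The usual test of whether a correlation coefficient differs from zero converts r to a t statistic with n - 2 degrees of freedom. The sketch below applies that conversion to the coefficient reported for boys (r = -.271, N = 60). The function name is illustrative, and the article does not state which test the authors applied, so this is an assumption about the method rather than a reproduction of it.

```python
import math


def t_from_r(r, n):
    """t statistic (n - 2 df) for testing that a correlation is zero."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r * r)


t_boys = t_from_r(-0.271, 60)
print(f"t = {t_boys:.3f} with {60 - 2} df")
```

With 58 df the two-tailed critical value at the .05 level is roughly 2.00, so a statistic of this size falls just beyond it.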


In order to examine the predicted relationships 
more closely, differences between pupils in the high- 
est third of the distribution of dogmatism scores and 
pupils in the lowest third were examined. These re- 
sults are summarized in Table 2. Because of the 
limited sample available, GPS scores were not in- 
cluded in these comparisons. 
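Each F in Table 2 is the ratio of the between-groups mean square to the within-groups mean square. The sketch below recomputes those ratios from the mean squares as they appear in the table; the dictionary layout and names are illustrative. Small differences from the printed F values are expected because the published mean squares are rounded.

```python
# (MS between, MS within) for each variable, as printed in Table 2.
table_2 = {
    "P-H": (0.01, 166.69),
    "IAR +": (0.45, 4.25),
    "IAR -": (6.61, 7.73),
    "CAIPS I": (68.45, 13.15),
    "CAIPS II": (19.01, 23.13),
}

# F is the between-groups mean square divided by the within-groups one.
f_ratios = {name: ms_b / ms_w for name, (ms_b, ms_w) in table_2.items()}

for name, f_value in f_ratios.items():
    print(f"{name:9s} F = {f_value:.3f}")
```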


TABLE 1


CORRELATIONS BETWEEN DOGMATISM SCORES
AND SELF-CONCEPT, LOCUS OF CONTROL,
CREATIVE PROBLEM SOLVING, AND ATTITUDES
FOR BOYS, GIRLS, AND TOTAL SAMPLE

Variable     Boys      N    Girls     N    Total     N

P-H         -.058     60    -.024    60    -.037   120
IAR +       -.094     60     .010    60    -.048   120
IAR -       -.135     60    -.202    60    -.168   120
GPS          .048     23     .235    20     .125    43
CAIPS I     -.271*    60    -.101    60    -.221   120
CAIPS II    -.228     60    -.114    60    -.127   120

*p < .05


The only significant difference between the high and
low dogmatism groups was on part I of the CAIPS
(attitudes about creative thinking and problem solving):
Pupils in the low dogmatism group had significantly
higher scores on the attitude measure than pupils in
the high dogmatism group.


TABLE 2


DIFFERENCES BETWEEN THE UPPER AND
LOWER THIRDS (N = 40) WHEN RANKED
ON FIGERT'S TEST SCORES

Variable     MS(b)     MS(w)      F       p

P-H            .01    166.69     <1      n.s.
IAR +          .45      4.25     <1      n.s.
IAR -         6.61      7.73     <1      n.s.
CAIPS I      68.45     13.15    5.206    p < .05
CAIPS II     19.01     23.13     <1      n.s.

With 1 and 78 df for all variables.


DISCUSSION AND CONCLUSIONS


The purpose of this study was to test several hypotheses
which, if supported, would provide evidence for
the validity of the elementary school form of the
dogmatism scale (4). Five hypotheses were tested;
the results for each will be discussed.




Hypothesis one, which dealt with differences among 
grade levels on the dogmatism test, was not support- 
ed. Our results, therefore, do not substantiate those 
reported by Figert (4). 


Hypothesis two, which proposed a negative rela- 
tionship between dogmatism scores and scores on the 
Piers-Harris self-concept scale, was not supported. 
The correlation between these tests in our sample 
was not reliably different from zero; nor were there 
significant self-concept differences between groups 
high and low on dogmatism. 


Hypothesis three predicted a negative relationship 
between dogmatism test scores and a measure of in- 
ternal locus of control. Again, there was no support 
for this hypothesis, since the correlations obtained 
did not differ significantly from zero, nor did high
and low dogmatism groups differ significantly on IAR 
scores. 


Hypothesis four predicted a negative correlation 
between dogmatism scores and creative problem 
solving scores. Since the correlation obtained did 
not differ significantly from zero, no support was 
found for the hypothesis. 

For the predicted negative relationship between
dogmatism scores and attitudes and self-concept
about creative thinking and problem solving (hypothesis
five), limited support was found. Pupils who
had lower dogmatism scores tended to have higher
(more favorable) attitudes toward creative thinking
and problem solving. There were no significant
relationships, however, between dogmatism scores and
pupils’ expression of self-concept of creative
problem-solving ability.


It would appear that our results offer virtually no
support for the validity of the Figert test. Among five
predictions, we have found significant support for only
one, and even in that case, the support was limited.


We must conclude that these data cast serious doubt
on the usefulness of the Figert test. Unless other evidence
can be obtained to provide support for the test’s
validity or to provide an acceptable alternative
interpretation of our data, it seems necessary to conclude
that scores on this test should not be interpreted as
an assessment of dogmatism, as this construct has
been defined elsewhere in psychological research.


ABSTRACT


Following the lead of earlier research on pupil control, the concepts of "humanism" and "custodialism"
were used to refer to contrasting types of individual ideology and the types of school organization they seek to
rationalize and justify. The Pupil Control Ideology Form (PCI) and the Organizational Climate Description
Questionnaire (OCDQ) were personally administered by a researcher to virtually all the professional personnel
in forty-five elementary schools; from this sample, fifteen "humanistic" and fifteen "custodial" schools were
identified. Using analysis of variance procedures, comparisons between the patterns of social interactions of
professional staff in humanistic and custodial schools revealed statistically significant differences on four of the
eight dimensions of the OCDQ. In addition, as predicted, humanistic schools were significantly more "open"
than custodial schools. The results suggested that the pupil control orientation of a school may provide another
important step in identifying the "social climate" of the school.

CONTROL IS a problem faced by all organizations,
but it is especially important in service organizations
which work with people or clients rather
than goods. Public schools are social units
specifically vested with a service function, the
socialization of the young. Furthermore, they are a
type of service organization in which neither the
organization nor the client exercises choice concerning
participation in the relationship; that is, public
schools have no choice in the selection of clients
(students), and the client must (in the legal sense)
participate in the organization (3). It should not be
surprising that organizations of this type are
apt to be confronted with some clients who
have little or no desire for the services of the
organization, a factor which accentuates the problem
of client control.

Indeed, there is no lack of opinion or prescription
on pupil control in public schools, but unfortunately
there is little systematic study on the subject, much
less study which begins from the perspective of the
school as a social system. Studies which have
focused on the school as a social system have described
antagonistic student subcultures and attendant conflict
(4, 6, 12). For example, Waller's
analysis of the social organization of the
school underscored the importance and centrality of
pupil control in both the structural and normative
aspects of the school culture.

More recent studies of public schools also have
emphasized the saliency of pupil control in the
organizational life of schools (9, 10, 11, 13, 14). For
example, in one study pupil control was described
as the "integrative theme" of the school which gave
meaning to teacher-teacher and teacher-administrator
relations. In the words of the researchers, "While
many other matters influenced the tone of the school,
pupil control was a dominant motif" (14:107).


Schools differ in terms of the nature of their educational
[illegible] and policies concerning control
of students. Some schools are characterized by stress
on maintenance of order, impersonality, distrust of
students, and, in general, a punishment-centered
orientation toward students. Other schools are
marked by an accepting, trustful view of students
and confidence in students to be self-disciplining and
responsible. Given the apparent significance of pupil
control, to what extent are these kinds of differences
in pupil control orientation related to other important
characteristics of schools? This question led to the
hypothesis that teacher-teacher and teacher-principal
interactions would be significantly different in schools
with humanistic pupil control orientation than in
schools with a custodial orientation.

"CUSTODIALISM" AND "HUMANISM"


Following the lead of earlier research on pupil
control, the concepts of "humanism" and "custodialism"
were adopted to refer to contrasting types of
individual ideology and the types of school organization
that they seek to rationalize (13).


In its extreme form, the custodial orientation fa- 
vors a rigid and highly controlled setting concerned 
primarily with the maintenance of order. Students 
are stereotyped in terms of their appearance, behav- 
ior, and parents’ social status. Teachers who hold 
a custodial orientation conceive of the school as an 
autocratic organization with a rigid pupil-teacher 
status hierarchy; the flow of power and communica- 
tion is unilateral downward. Students must accept 
the decisions of teachers without question. Student 
misbehavior is viewed as a personal affront; students 
are perceived as irresponsible and undisciplined per- 
sons who must be controlled through punitive sanc- 
tions. Impersonality, pessimism, and ‘‘watchful 
mistrust” imbue the atmosphere of the custodial 
school. 


The model for the humanistic orientation, on the
other hand, is the school conceived as an educational
community in which students learn through cooperative
interaction and experience. Learning and
behavior are viewed in psychological and sociological
terms rather than moralistic ones. Self-discipline
is substituted for strict teacher control. The humanistic
orientation leads teachers to desire a democratic
atmosphere with its attendant flexibility in
status and rules, sensitivity to others, open communication,
and increased student self-determination.
Both teachers and pupils are willing to act on their
own volition and to accept responsibility for their
actions.


TEACHER-PRINCIPAL INTERACTIONS 


In a major study of seventy-one elementary
schools, Halpin and Croft (8) identified and described
eight basic characteristics of social interaction between
the principal and the teachers. Four of the
characteristics refer to teacher behavior: Disengagement,
Hindrance, Esprit, and Intimacy; and four
describe principal behavior: Aloofness, Production
Emphasis, Thrust, and Consideration. The behavior
described by each characteristic is briefly described
below (7):


Disengagement indicates that the teachers do not 
work well together. They pull in different direc- 
tions with respect to the task; they gripe and 
bicker among themselves. 


Hindrance refers to the teachers’ feeling that the 
principal burdens them with routine duties, com- 
mittee demands, and other requirements which 
they construe as unnecessary busy-work. 


Esprit refers to “morale.” The teachers feel 
that their social needs are being satisfied, and 
that they are, at the same time, enjoying a sense 
of accomplishment in their job. 


Intimacy refers to the teachers’ enjoyment of 
friendly social relations with each other. 


Aloofness refers to behavior by the principal
which is characterized as formal and impersonal.
He "goes by the book" and prefers to be
guided by rules and policies rather than to deal
with the teachers in an informal, face-to-face
situation.


Production Emphasis refers to behavior by the 
principal which is characterized by close super- 
vision of the staff. He is highly directive and 
task-oriented. 


Thrust refers to behavior marked not by close 
Supervision of the teacher, but by the principal's 
attempt to motivate the teachers through the ex- 
ample which he personally sets. He does not ask 
the teachers to give of themselves anything more 
than he willingly gives of himself; his behavior, 
though starkly task-oriented, is nonetheless 
viewed favorably by the teachers. 


Consideration refers to behavior by the principal
which is characterized by an inclination to treat
the teachers "humanly," to try to do a little something
extra for them in human terms.


In addition, Halpin and Croft (8) conceptualized 
social interactions of professional personnel of 
schools in terms of a more general factor, openness.
The openness of a school refers to actions which 
emerge freely and without constraint; that is, the 
behavior of the group members is genuine or authen- 
tic. Leadership acts are readily initiated from both 
the principal and teachers, and the group is not in- 
ordinately concerned with either task achievement 
or social-needs satisfaction. Satisfaction on both 
counts emerges easily and almost effortlessly. 


The concept of openness in organizational behavior
seems highly compatible with a humanistic pupil
control orientation. If pupil control is a salient feature
of the organizational life of schools, it seems
reasonable to further hypothesize that “humanistic”
schools will be significantly more open in teacher-principal
interactions than “custodial” schools.


PROCEDURES 


Instruments 


The PCI Form was the operational measure for
pupil control orientation; it consists of twenty Likert-type
items. Responses are scored from 5 (strongly
agree) to 1 (strongly disagree); the higher the overall
score, the more custodial the ideology of the
respondent.


Examples of items used include: “A few pupils
are just young hoodlums and should be treated
accordingly.” “It is often necessary to remind pupils
that their status in schools differs from that of
teachers.” And, “Pupils can be trusted to work together
without supervision” (score reversed).
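
The scoring rule just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `score_pci` is hypothetical, and because the text identifies only one reverse-scored item by example, the positions of reverse-scored items are passed in by the caller rather than taken from the instrument.

```python
from typing import Iterable, Sequence


def score_pci(responses: Sequence[int], reversed_items: Iterable[int] = ()) -> int:
    """Sum twenty Likert responses (5 = strongly agree ... 1 = strongly disagree).

    `reversed_items` holds zero-based positions of reverse-scored items.
    Which items are reversed on the real form is NOT given in the text,
    so these positions are an assumption supplied by the caller.
    """
    if len(responses) != 20:
        raise ValueError("the PCI Form has twenty items")
    if any(r not in (1, 2, 3, 4, 5) for r in responses):
        raise ValueError("responses must be integers from 1 to 5")
    flip = set(reversed_items)
    # A reversed item contributes 6 - r, so 5 becomes 1 and 1 becomes 5.
    return sum(6 - r if i in flip else r for i, r in enumerate(responses))


# Hypothetical respondent: agrees strongly with every statement,
# with the first item treated as reverse-scored for illustration.
print(score_pci([5] * 20, reversed_items=[0]))
```

Under this rule total scores fall between 20 and 100, with higher totals indicating a more custodial ideology.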

In earlier research (13), split-half reliability
coefficients of the instrument in two samples were
.95 (N = 170) and .91 (N = 55) with application of the
Spearman-Brown formula. Validity of the measure
[...] significantly higher (p < .01, using t-test
procedure) [...] Form scores than a like number of
teachers judged to be most humanistic.²


The OCDQ is composed of sixty-four Likert-type
items which teachers and principals may use to
describe [...] of social interaction in their
schools. By factor analysis, Halpin and Croft (8)
subdivided the OCDQ into eight dimensions, each with
a corresponding subtest. The Disengagement, Hindrance,
Esprit, and Intimacy subtests refer primarily
to the behavior of the teachers; and the Aloofness,
Production Emphasis, Thrust, and Consideration
subtests to the behavior of principals. Further factor
analysis of school profiles led to the identification
of a general openness factor. Openness scores for
schools can be computed by summing the Esprit and
Thrust subtest scores and then subtracting the
Disengagement score.


Findings of numerous studies have supported the
validity and reliability of the eight OCDQ subtests
(1, 2). For example, a major validity study was
conducted by Andrews (1); using the method of construct
validity he concluded “. . . the subtests of
the Organizational Climate Description Questionnaire
provide reasonably valid measures of important aspects
of the school principal's leadership in perspective
of interaction with his staff” (1:333).


Sample


Forty-five elementary schools in thirty school
districts comprised the sample of this study. Several
criteria were used in the selection of elementary
schools for study. In order to allow sufficient
opportunity for the development of interaction patterns
between the principal and teachers, only schools with
principals who were near the completion of at least
their second year as full-time principals and who
served in only one building were included in the
sample. Further, elementary schools were selected
from various types of communities: rural, town or
small city, suburban, and urban.


Originally, fifty schools seemed to meet the selection
criteria and were asked to participate in the
study. Four schools declined the invitation to
participate, and further information excluded another
school from the sample. The forty-five elementary
schools that agreed to participate were personally
visited by a researcher and both the PCI and OCDQ
were administered to the professional personnel during
regularly scheduled faculty meetings. Virtually
all of the teachers and principals in each school
responded to the instruments.


In this phase of the investigation, fifteen relatively
“custodial” and fifteen relatively “humanistic”
elementary schools were identified from the original
group of forty-five. Those schools with the highest
mean PCI scores were designated as custodial
schools (range = 52.2 - 61.8) while those schools
with the lowest PCI scores were termed humanistic
schools (range = 45.7 - 52.5).


RESULTS


As predicted, the examination of the profiles of
humanistic and custodial schools found in Figure 1
indicates some important differences between the
climate characteristics of the two types of schools.
Analysis of variance computations yielded significant
F ratios for differences between means of humanistic
and custodial schools on Disengagement (F = 9.71,
p < .01), Esprit (F = 5.82, p < .01), Aloofness (F =
26.35, p < .01), and Thrust (F = 26.02, p < .01). The
degree of Intimacy and Production Emphasis was relatively
the same in both types of schools. Teachers
in humanistic schools experienced slightly less Hindrance
and described their principals as more considerate
than those in custodial schools. However,
the differences between the means were not significant
at the .05 level for either of these two dimensions
(F = 1.47 and F = 3.41 respectively). These data
are summarized in Table 1.


Furthermore, as hypothesized, elementary schools
with a humanistic pupil control orientation were
significantly more open than those with a custodial
pupil control orientation (F = 18.77, p < .01). The
relevant data are also summarized in Table 1.


TABLE 1

SUMMARY DATA FOR HUMANISTIC AND CUSTODIAL ELEMENTARY SCHOOLS

                          HUMANISTIC         CUSTODIAL
                        SCHOOLS (N=15)     SCHOOLS (N=15)
SCHOOL                                                        MEAN
CHARACTERISTIC           MEAN      SD       MEAN      SD    DIFFERENCE   F-RATIO

Teacher Behavior
  Disengagement          49.93    5.32      55.13    3.66     -5.20       9.71*
  Hindrance              51.93    3.78      53.60    3.73     -1.67       1.47
  Esprit                 50.26    6.55      45.46    4.05     +4.80       5.82*
  Intimacy               51.66    4.10      51.00    2.07     +0.66       0.32

Principal Behavior
  Aloofness              46.46    5.54      54.73    2.86     -8.27      26.35*
  Production Emphasis    49.00    3.62      49.53    4.27     -0.53       0.14
  Thrust                 51.13    4.20      43.33    4.16     +7.80      26.02*
  Consideration          53.66    4.46      51.26    2.31     +2.40       3.41

“Climate”
  Openness               51.47   13.27      33.67    8.79    +17.80      18.77*

* p < .01


Although the present analysis focused on a contrast
of schools in the sample with extreme pupil
control orientation scores (upper third versus lower
third), it is instructive to note that when coefficients
of correlation were also computed, using data from
all forty-five schools in the sample, parallel
relationships emerged. Average pupil control ideology
scores of schools correlated significantly with mean
scores on Disengagement (r = .40, p < .01), Esprit
(r = -.49, p < .01), Aloofness (r = .67, p < .01), Thrust
(r = -.60, p < .01), and Openness (r = -.61, p < .01).
Correlations between PCI and Hindrance (r = .23,
p > .05), Intimacy (r = -.18, p > .05), Production
Emphasis (r = .11, p > .05), and Consideration (r = -.25,
p > .05) were all not statistically significant (recall
the higher the PCI score the less humanistic the
school).
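
As a quick arithmetic check on the Openness means in Table 1, the scoring rule given under Instruments (Esprit plus Thrust minus Disengagement) can be applied to the group means. The short Python sketch below does this; the function name `openness` is illustrative, and applying the rule to group means rather than to each school's scores is an assumption made only for the check.

```python
def openness(esprit: float, thrust: float, disengagement: float) -> float:
    """Openness = Esprit + Thrust - Disengagement, per the Instruments section."""
    return esprit + thrust - disengagement


# Group means taken from Table 1.
humanistic = openness(esprit=50.26, thrust=51.13, disengagement=49.93)
custodial = openness(esprit=45.46, thrust=43.33, disengagement=55.13)

print(f"humanistic: {humanistic:.2f}  (Table 1 reports 51.47)")
print(f"custodial:  {custodial:.2f}  (Table 1 reports 33.67)")
```

Both computed values fall within .01 of the tabled Openness means, a gap that is consistent with rounding of the subtest means.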


SUMMARY AND DISCUSSION 


Humanistic schools were found to be different from
custodial schools in several important ways. In addition
to the basic contrast in orientations toward student
control as measured by PCI scores, humanistic
schools were more likely than custodial schools to
have: (1) teachers who work well together, that is,
pull together with respect to the teaching-learning
task; (2) high morale and satisfied teachers, satisfaction
growing out of a sense of task accomplishment
and fulfillment of social needs; (3) principals who
deal with teachers in an informal, face-to-face
situation rather than “go by the book”; (4) principals
who do not supervise closely but instead attempt to
motivate through personal example; and (5) an
atmosphere marked by openness, acceptance, and
authenticity in teacher-principal interactions.


The data suggested that authenticity and openness
in organizational behavior seem highly compatible
with a humanistic pupil control orientation and
incompatible with a custodial orientation. If interactions
among teachers and between teachers and principals
are authentic in humanistic schools, then it
seems reasonable to hypothesize that authenticity
will also tend to pervade teacher-pupil interactions.
A humanistic pupil control orientation would appear
to facilitate, and be facilitated by, authentic
interaction between teachers and pupils.


The importance of the concept of openness in the 
organizational climate of schools has been discussed 
in detail by Halpin and Croft; they posed the interesting
query that perhaps “. . . climate profiles may
actually constitute a better criteria of a school's
effectiveness than many measures that already have
entered the field of educational administration and
now masquerade as criteria” (8:82-83). If the openness
of the school climate provides one valid criterion
of school effectiveness, then elementary schools
with a humanistic pupil control ideology would appear 
to be significantly more effective, at least in terms 
of expressive or social emotional development, than 
those with a custodial orientation. 


Moreover, to the extent that an elementary school 
attempts to communicate values as well as to com- 
municate knowledge and develop skills, a humanistic 
pupil control ideology seems highly functional. A positive
and strong commitment of students to the school
seems required to effectively communicate values (5).
It also appears unlikely that such commitment can be
effectively attained in the custodial school; in fact, the
custodial atmosphere in the school is more likely to produce
alienation of students rather than commitment.


Although humanism and custodialism are descriptive
terms assigned to contrasting pupil control orientations
in elementary schools, it is difficult not to
idealize the former through contrast with the latter. 
However, a word of caution seems in order. It is 
one thing to describe humanistic schools in a gener- 
al way, but it is quite a different matter to find teach- 
ers equipped with demonstrably sound psychological 
and sociological theories necessary for the effective 
service of a humanistic approach. There are no simple
approaches in changing the climate or atmosphere
of schools. For example, recent research findings
suggest that the pupil control ideology of beginning
teachers becomes significantly more custodial as they
become socialized by the teacher subculture, a subculture
described by the vast majority of new teachers
as one in which good control and good teaching
were equated (9, 10, 11). More research is necessary
to explore various strategies for changing the
atmosphere of schools. For example, the study of 
the conflicts and adaptations of humanistic teachers 
attempting to teach in custodial schools and of cus- 
todial teachers working in humanistic schools might 
supply some useful clues in developing such a strategy. 


In brief, the significance of pupil control orienta- 
tion, as an important aspect of the organizational 
life of elementary schools, was underscored by the 
findings of this study. The concepts of custodialism 
and humanism provided a useful means for identify- 
ing schools with important differences in patterns of 
social interaction. If statements concerning orientation
correspond relatively well with behavior, then
the pupil control orientation of a school may provide
another important step in identifying the “social climate”
of the school.
FOOTNOTES


1. This research was supported in part by a
grant from the Oklahoma State University
Research Foundation.


2. For a complete discussion of the development of
the PCI Form, see Donald J. Willower, Terry
L. Eidell, and Wayne K. Hoy, The School and
Pupil Control Ideology, Penn State Studies
Monograph No. 24, University Park, Pennsylvania,
1967.
ABSTRACT


The ability of a congruity model to predict composite sign meaning as defined by responses to a semantic
differential (SD) questionnaire was examined. The composite signs, component signs, and Ss were associated
with the field of education. In most instances, obtained measures of factor scores were systematically lower than
predicted measures. However, the addition of a constant c, such that -0.3 ≤ c ≤ -0.2, to the predicted measures
generally removes this difference. Obtained and predicted factor scores were correlated to indicate their
relationship independent of a systematic error. After accounting for the reliability of SD factor scores the correlations
indicate that the congruity model does predict meanings of composite signs from meanings of component signs.


A SEMANTIC differential (SD) is a device
which may be used to determine the connotative meaning
of signs such as teaching or children. The meaning
of a sign is defined as its point in Euclidean n-space
with coordinates $a_1, a_2, \ldots, a_n$. Each dimension of
the meaning space for a sign is determined by a factor
analysis of Ss' responses to a set of 7-point scales
each defined by a pair of bipolar adjectives such as
good-bad or hard-soft. Thus two signs having different
meanings will be associated with different points
in the meaning space.


What happens when two or more signs are present
together? One might expect the meaning of teaching
to interact with the meaning of children to educe the
meaning of teaching children. Osgood, Suci and Tannenbaum
(4:199-216) developed a model to predict
the meaning of a composite sign such as teaching
children from the measured meanings of the component
signs teaching and children. Their model is:

$$d_c = \frac{|d_1|}{|d_1| + |d_2|}\, d_1 + \frac{|d_2|}{|d_1| + |d_2|}\, d_2$$

where d is the deviation from neutrality on SD scales
(i.e., the location on a 7-point scale from -3 to +3),
c refers to the composite sign, and 1 and 2 refer to
the first and second component signs respectively.
The model is to be applied separately to each dimension
of the meaning space. Osgood, Suci, and Tannenbaum
(4:275-284) cite evidence to support the
predictive power of their model. They report that:
(1) obtained factor scores for composite signs are
consistently within the limits set by the factor scores
of the components; (2) obtained factor scores deviated
from the predicted scores on the average only by
amounts attributable to unreliability except for factor
I, the evaluative factor; (3) obtained and predicted
factor scores exhibit a high positive correlation.
They concluded that semantic effects follow the
expectations from the congruity principle quite closely
for the average meaning of composite signs.


This study was designed to determine whether or
not the principle of congruity predicts composite
sign meaning with component signs, composite signs,
and Ss from elementary education.


METHODS


The Ss were seventy-one seniors majoring in
elementary education at Purdue University and enrolled
in their professional semester. Fourteen bipolar
adjective scales were chosen by searching the
literature for SD scales which consistently showed
high and relatively pure loadings across a variety of
signs judged by different kinds of Ss. A SD consisting
of five component signs and four composite signs
each to be rated on the fourteen scales was administered
to each S. The component signs were mathematics,
social studies, science, language arts, and
teaching children. The composite signs were teaching
children mathematics, teaching children social
studies, teaching children science, and teaching
children language arts. The order of sign and scale
presentation was randomized as was the order of
adjectives within scales. In-class time was used to
administer the questionnaire; every S completed every
item. Principal components factor analysis with
rotation to Kaiser's varimax criterion (2, 3)
revealed that three factors accounted for 0.50 to
0.75 of the variance across scales among the
nine signs. Table 1 lists the scales whose loadings
were greater than 0.3 on their respective factors for
at least seven out of nine signs for factors I and II,
and for at least six out of nine signs for factor III.
The remaining four scales were discarded since they
were confounded across factors.


TABLE 1
SD SCALES ASSOCIATED WITH EACH FACTOR

Factor I                   Factor II          Factor III

happy-sad                  heavy-light        fast-slow
good-bad                   hard-soft          hot-cold
heavenly-hellish           difficult-easy
positive-negative
optimistic-pessimistic


FINDINGS AND ANALYSIS 

Scores for each S across the nine signs were calculated
by computing mean scores for the set of SD
scales within each factor. Predicted scores for each 
of the four composite signs were computed using the 
congruity model. Mean obtained and predicted 
scores over Ss are presented in Table 2. 


A Z test for correlated data was used in comparing
obtained and predicted means of composite
concepts for factor I because the variances among
scores for composite signs on factor I were not
homogeneous. These means were significantly different
(α < 0.01). Since homogeneity of variance obtained
among scores for composite signs on factors II
and III, t tests for correlated data were used to
analyze scores for these factors. Six out of eight
differences were significant at α < 0.05. The alpha
level for each difference is displayed in Table 2.


It appears that the predictive power of the congruity
model is somewhat stronger with factor II
scores than with scores for factors I and III. In
fact, the differences between obtained and predicted
scores for factors I and III are significant at
the 0.01 level in all but one case. Moreover, the
predicted scores are consistently higher than the
obtained scores for factors I and III. If a constant
of about -0.3 were introduced into the prediction
formula the differences between predicted and obtained
scores for factors I and III would virtually
disappear. The insertion of a constant of -0.3 would
decrease the predictive ability of the formula in only
one case among the factor II scores.

To obtain a different measure of the predictive
validity for the congruity formula, mean component
sign scores over Ss for each factor were calculated.
These scores are displayed in Table 3.


TABLE 3

OBTAINED MEAN COMPONENT SIGN SCORES OVER Ss

                      Factor I    Factor II    Factor III

Language Arts           1.70         .47          .63
Mathematics             1.90         .77          .95
Science                 1.67         .68          .31
Social Studies          1.23         .35          .34
Teaching Children       2.25     [illegible]      .69


Predicted means for the composite signs were
computed by substituting the mean scores for the
component signs into the congruity formula. Table
4 includes these predictions together with the obtained
means for the composite signs.
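
The substitution step can be illustrated with a short Python sketch. It assumes the usual form of the Osgood, Suci and Tannenbaum congruity formula, in which each component's deviation is weighted by its share of the total absolute deviation; the function name `congruity` is hypothetical. The inputs are the factor I means for three subject-matter signs and for teaching children as listed in Table 3.

```python
def congruity(d1: float, d2: float) -> float:
    """Predicted composite deviation from two component deviations.

    Assumed form: each component is weighted by |d_i| / (|d_1| + |d_2|).
    """
    total = abs(d1) + abs(d2)
    if total == 0:
        return 0.0  # both components neutral, so the composite is neutral
    return (abs(d1) * d1 + abs(d2) * d2) / total


# Factor I means from Table 3.
teaching_children = 2.25
components = {"Language Arts": 1.70, "Science": 1.67, "Social Studies": 1.23}

for name, score in components.items():
    predicted = congruity(score, teaching_children)
    print(f"Teaching Children {name}: {predicted:.2f}")
# Prints 2.01, 2.00 and 1.89 for the three rows.
```

Because the weights are proportional to polarization, the more extreme component (here, teaching children) pulls the prediction toward itself, and the predicted value always lies between the two component scores.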

Using t-tests for correlated data six of the differences
between predicted and obtained mean scores
are significant at the 0.01 level. The alpha level for
each difference is displayed in Table 4. The pattern
of differences between obtained and predicted scores
when the predicted scores are generated from mean
scores from component signs is quite similar to the
pattern observable in Table 2. Prediction of factor
II scores is better than prediction of factor I and III
scores. In factors I and III, the predicted scores are
higher than the obtained scores in all but one case.
If the constant -0.3 were inserted in the congruity
formula, predictions would be improved in six out of
twelve cases. Predictions would be improved in eight
out of twelve cases if the constant were -0.2. The
data summarized in Tables 2-4 indicate that predictions
of mean scores within factors based on the congruity
formula may be improved by adding a constant.


TABLE 2

MEAN FACTOR SCORES FOR FOUR COMPOSITE SIGNS OVER Ss

                                         Factor I                    Factor II                   Factor III
                                   Obtained  Predicted         Obtained  Predicted         Obtained  Predicted

Teaching Children Language Arts      1.81    2.10 (α<.01)        .67     .55 (α<.40)         .39     .69 (α<.01)
Teaching Children Mathematics        1.60    2.04 (α<.01)        .78     .99 (α<.10)         .32     .68 (α<.01)
Teaching Children Science            1.85    2.12 (α<.01)        .61     .85 (α<.05)         .58     .91 (α<.01)
Teaching Children Social Studies     1.60    1.99 (α<.01)        .18     .49 (α<.02)         .40     .66 (α<.02)


TABLE 4

PREDICTED AND OBTAINED MEANS FOR COMPOSITE SIGNS

                                         Factor I                    Factor II                   Factor III
                                   Obtained  Predicted         Obtained  Predicted         Obtained  Predicted

Teaching Children Language Arts      1.81    2.01 (α<.01)        .67     .40 (α<.01)         .39     .66 (α<.01)
Teaching Children Mathematics        1.60    1.90 (α<.01)        .78     .68 (α<.40)         .32     .84 (α<.01)
Teaching Children Science            1.85    2.00 (α<.05)        .61     .99 (α<.50)         .58     .51 (α<.50)
Teaching Children Social Studies     1.60    1.89 (α<.01)        .18     .29 (α<.40)         .40     .97 (α<.05)


Product-moment correlation coefficients between
obtained and predicted scores over Ss were computed.
These data, presented in Table 5, give an indication
of the relationship between obtained and predicted
scores which would remain invariant if a constant
c were added to each predicted score.
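
The invariance mentioned here is a general property of the product-moment coefficient: shifting every predicted score by the same constant changes the mean difference but not the correlation. The Python sketch below demonstrates this with made-up numbers; the scores are hypothetical and are not taken from the study, and `pearson_r` is an illustrative helper.

```python
from math import sqrt


def pearson_r(x, y):
    """Product-moment correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical obtained and predicted factor scores for five Ss.
obtained = [1.2, 1.9, 2.4, 0.8, 1.6]
predicted = [1.6, 2.1, 2.5, 1.3, 2.0]

c = -0.3  # constant in the range discussed in the text
adjusted = [p + c for p in predicted]


def mean_gap(pred):
    """Mean of (predicted - obtained) over Ss."""
    return sum(p - o for p, o in zip(pred, obtained)) / len(obtained)


print(f"r before: {pearson_r(obtained, predicted):.4f}")
print(f"r after:  {pearson_r(obtained, adjusted):.4f}")
print(f"mean gap before: {mean_gap(predicted):+.2f}, after: {mean_gap(adjusted):+.2f}")
```

The two correlations agree, while the mean gap shrinks by exactly the size of the constant, which is why the tests of mean differences and the correlations give complementary information.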


Test-immediate retest reliabilities of factor scores
for seventh grade Ss were 0.84 for factor I, 0.72 for
factor II, and 0.69 for factor III (1). While these
coefficients might be expected to be somewhat higher
for adult Ss, some of the correlations reported in
Table 5 appear to be pushing their upper bound. All
but the correlation for factor III under teaching
children science are respectably high.


TABLE 5

CORRELATIONS BETWEEN OBTAINED AND
PREDICTED COMPOSITE SIGNS OVER Ss

Teaching Children     Factor I    Factor II    Factor III

Language Arts           .676        .185         .593
Mathematics             .550        .587         .749
Science                 .608        .615         .312
Social Studies          .905        .565         .519


CONCLUSIONS AND RECOMMENDATIONS


The ability of a congruity model to predict composite
sign meaning as defined by responses to a
semantic differential questionnaire was examined.
The component signs, composite signs, and Ss were
all associated with teaching in the elementary school.
There were seventy-one Ss, each enrolled in a
professional semester for prospective elementary
school teachers.


Two avenues of analysis were followed. First, a
series of tests of differences between predicted and
obtained measures of factor scores was completed.
These data revealed a trend toward obtained measures
being systematically lower than predicted
measures. Thus, while the prediction model failed to
“hit the mark,” the adjustment of adding a constant, c,
such that -0.3 ≤ c ≤ -0.2, to the predicted measures
would have improved its marksmanship. Second,
obtained and predicted factor scores were correlated
to indicate their relationship independent of
a systematic error such as the one described above.
After accounting for the reliability of SD factor scores
the correlations indicate that the congruity model
does predict meanings of composite signs from
meanings of component signs.


Additional research should confirm or refine
the estimate that -0.3 ≤ c ≤ -0.2 is an optimum
constant to use in revising the model for use with
signs and Ss from the field of education.


FOOTNOTE 


1. The work reported herein was performed pursuant
to a grant from the U. S. Office of Education, De- 
partment of Health, Education, and Welfare. 


REFERENCES 


1. DiVesta, F. J.; Dick, W., “The Test-retest
Reliability of Children's Ratings on the
Semantic Differential,” Educational and
Psychological Measurement, 26:605-616, 1966.


2. Kaiser, H. F., “The Varimax Criterion for
Analytic Rotation in Factor Analysis,”
Psychometrika, 23:187-200, 1958.


3. Kaiser, H. F., “The Application of Electronic
Computers to Factor Analysis,” Educational and
Psychological Measurement, 20:141-151, 1960.


4. Osgood, C. E.; Suci, G. J.; Tannenbaum, P. H.,
The Measurement of Meaning, University of
Illinois Press, Urbana, 1957.


THE JOURNAL OF EXPERIMENTAL EDUCATION 
(Volume 39, Number 2, Winter 1970) 


READING GROUPS AS PSYCHOLOGICAL GROUPS 
ABSTRACT


Six classes of first-grade children were given a sociometric question asking them with which classmates they
would prefer to work. The children were classified by their reading groups, and their in-group and out-group
choices were analyzed. Each class was naturally trichotomized into three reading groups: top, middle, and lower.
The reading groups were the only enduring groups of the classes. It was found that the lower reading group members
chose fewer than expected children from their own groups (p < .025) and more than expected children from
the top reading groups (p < .001). Members of the middle reading groups made fewer than expected choices from
the lower reading groups (p < .01) and more than expected choices from the top reading groups (p < .001). The
top reading group members made fewer than expected choices from the lower reading groups (p < .001), fewer
than expected choices from the middle reading groups (p < .06), and chose within their own groups more than
expected (p < .001). The results were discussed in terms of group cohesiveness and possible group effects upon learning.


CHILDREN can have, within the school framework,
many and varied group experiences. Generally,
school groups have been studied in terms of the cri- 
terion used for grouping and the resulting effects of 
the grouping on learning, and in terms of a group’s 
psychological properties and their effects on individ- 
ual behavior. 


The educational researcher generally studies
grouping procedures to determine their effects on
learning. A number of the studies in grouping have
involved reading groups; these studies usually use
the skill of reading as a dependent variable and the
grouping criteria as the independent variables. For
example, the educational researcher has been concerned
with the effect ability grouping has on reading
development (18). Within this context, the definition
of a group is a collection of individuals with some
degree of homogeneity. The focal point of this type of
research is on the verb form of the word group, which
is to arrange or to form.


The psychologist who studies groups is usually
concerned with consistent and persistent behavior
patterns which emerge within an interacting group.
Behavior patterns which consistently emerge in groups
and which persist through the life of groups are
commonly referred to as being properties of the group.
Some of the more common group properties that have
been delineated are cohesiveness, conformity,
leadership, and status. Within this context, the definition
of a group is a collection of interacting individuals.
As a result of the interaction, each member is
changed by his group membership, and each member
would probably undergo a change as a result of changes
in the group (6). Of these group properties, Deutsch
and Krauss (7) view cohesiveness as one of the most
significant. Bonner (4) not only feels that cohesiveness
is a basic group property, but feels that without
at least minimal attraction among group members a
group could not exist at all. In general, the group
property of cohesiveness can be considered to be one
of the fundamental dimensions of interpersonal attraction.
Once a group is formed through interaction and
interpersonal attraction, other group properties develop
and group consequences emerge that affect each
individual within the group. Lott and Lott define
cohesiveness as: “That group property which is
inferred from the number and strength of mutual
positive attitudes among the members of a group”
(13:408).


Although the school class has been studied as a
psychological group in order to describe its social
structure (5, 10), instructional groups such as reading
groups have usually been studied in terms of how their
formation criteria affect reading development (e.g.,
15). However, these instructional groups have not
been studied as social structures with behavior
patterns which may be affecting reading development.


The purpose of this study was to determine if read- 
ing groups develop into groups in the psychological 
sense. Since cohesiveness is recognized as a signif- 
icant factor in the development and continuation of 
groups, a cohesiveness measure was used to deter- 
mine whether or not reading groups develop into psy- 
chological groups. If a homogeneous collection of in- 
dividuals in a reading group show positive attitudes 
toward each other by choosing each other for an ac- 
tivity, in preference to individuals in other reading 
groups, then the reading group may be postulated to 
be cohesive, and hence a psychological group. If 
individuals in the reading group choose individ- 
uals from other reading groups in preference to the 
members of their own, then it may be postulated that 
the reading group is not cohesive and therefore, not 
a psychological group. 


METHOD 
Subjects 


A sociometric test was administered to six first- 
grade classes from two elementary schools located 
in Lexington, Kentucky. The schools were located in 
similar socioeconomic areas. One school was in an 
older middle-class residential area and the other in 
a newer middle-class suburban residential area. The 
classes were chosen by the school principals to meet 
the following criteria: a self-contained classroom; 
experienced teachers (suggested by the principals ); 
average children who had little mobility in and out of 
school; relatively stable reading groups; and classes 
with the same number of reading groups. The class 
sizes ranged from twenty-three to twenty-nine with
a total of 160 first graders. 


In each class the children had been divided into 
three reading groups by the ability criterion. Group 
1 was the high ability group, group 2 was of average 
ability, and group 3 was the low ability group. Each 
class had special names for the reading groups, which
ranged in size from five to thirteen.


Out of the total number of children, one boy was
a repeater, one boy had severely retarded speech, and
one girl was mentally retarded (the teachers helped
give the answers for the last two children). All but
two of the children had been in their present classes
for at least one semester. Several children were absent
the first day of testing but were seen later in the
week.


First graders were selected for this study for
three reasons: (1) the classes are usually self-contained
in the first grade; (2) the reading group is usually
the only instructional group (in other grades math
groups and other special activity groups are formed);
and (3) first graders have had relatively little other
formal group experiences, e.g., Brownies, Cub Scouts,
etc.


Procedure


Each of the children was individually asked the
following sociometric question: If (your teacher's
name) was going to have you work with three other
children in this class, which three children would
you like to work with?


The authors chose a sociometric question which
employed the concept of work in order to convey a
school activity as opposed to a play situation. The
word work was not thought to be leading as opposed
to other words which relate to specific concepts such
as read.


The sociometric question was administered near
the end of the children's first year of school. Each
child was seen individually, outside the classroom.
The question was given in the afternoon so as not to
interfere with the morning reading program. The teacher
introduced the interviewer and told the children
they were going to be asked several questions about
the class. The children were instructed to meet with
the interviewer one at a time. The interviewer chatted
with each child for a few minutes in order to establish
rapport. The main sociometric question was
asked, then the child was asked why he (she) chose
each of the children he (she) named. Each child was
thanked and then asked to tell the next child to come
out. Each child was seen for approximately 5 minutes.


In addition to the sociometric question for the children,
the teachers were asked (1) for the names of
the children in each reading group and if any of the
children had been recently changed from one group
to another, (2) if they divided the class for any other
enduring group activity with which the children might
identify, and (3) which five children she felt were
best in reading, sports, math, or were the most popular
children, and any other information about the
children she cared to share with the investigator. The
last question was asked to determine the teachers'
attitudes toward the children, and to determine whether
the teacher saw the children as they saw each other.


RESULTS 
Expected Choice Patterns 


The first question asked in analyzing the data was:
"Do the choices follow the expected pattern?" The
expected values were the chance number of choices
which would be received by members of each group
if each group member had an equally likely chance of
being chosen. The children's three choices were arranged
into sociograms and tallied for chi-square
analyses.* Table 1 lists the obtained (f) number of
choices and the expected (F) choice frequencies for
the reading groups of each of the six classes (School
A, classes 1 through 3; and School B, classes 1 through
3), the overall chi-square for each class, and the
overall chi-square for the total of the six classes.
Five of the chi-squares were significant beyond the
.001 level. The chi-square for the sixth class was
significant at the .005 level. The overall chi-square
was also significant beyond the .001 level. The overall
chi-squares computed separately for School A and
School B showed the same results.
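As a rough illustration of the computation described above, the following sketch recomputes the chi-square for Class A1 from the f and F values listed in Table 1. The function name `chi_square` and the variable names are illustrative only and are not taken from the original study.

```python
# Chi-square for one class, using the Class A1 values from Table 1.
# Names here are illustrative; only the f and F values come from the table.
observed = [40, 16, 22]  # f: choices received by G1, G2, G3
expected = [21, 24, 33]  # F: chance expectation for G1, G2, G3


def chi_square(obs, exp):
    """Return the sum of (f - F)^2 / F over all groups."""
    return sum((f - F) ** 2 / F for f, F in zip(obs, exp))


# Per-group contributions, then the class total.
components = [(f - F) ** 2 / F for f, F in zip(observed, expected)]
chi2_a1 = chi_square(observed, expected)

print([round(c, 2) for c in components])
print(round(chi2_a1, 2))
```

The unrounded total is about 23.52; Table 1 reports 23.53, which is what results from adding the three contributions after each has been rounded to two decimals.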


Intragroup Choices: Cohesiveness 


The second question asked in analyzing the data
was "Were the reading groups cohesive?" Table 2
shows the chi-squares for the three reading group levels.
The overall chi-square for the top reading groups
indicates that these groups made more (p < .001)
intragroup choices and fewer intergroup choices than
expected. The overall chi-square for the lower reading
groups indicates that these groups made fewer
intragroup choices and more intergroup choices than
expected. The overall chi-square for the middle reading
groups was nonsignificant but in the negative direction
(p < .11, sign test).


TABLE 1
OVERALL CHI-SQUARES FOR THE SIX CLASSES(a)

Class   Group    f     F    (f-F)²/F     χ²
A1      G1      40    21     17.19
        G2      16    24      2.67      23.53**
        G3      22    33      3.67
A2      G1      68    39     21.56
        G2      14    27      6.26      40.01**
        G3       5    21     12.19
A3      G1      38    24      8.17
        G2      31    33      0.12      13.09*
        G3      18    30      4.80
B1      G1      50    27     19.59
        G2      14    21      2.33      30.45**
        G3      14    30      8.53
B2      G1      48    33      6.82
        G2      16    21      1.19      14.68**
        G3       5    15      6.67
B3      G1      43    24     15.04
        G2      16    24      2.67      21.38**
        G3      22    33      3.67

Overall chi-square(b)                  143.14

(a) df = 2.
(b) For the overall chi-square, df = 12 and p < .001.
*p = .005, **p = .001.


Intergroup Choices 


The third question asked in analyzing the data was
"In what direction were the intergroup choices made?"
Table 3 shows the expected (F) and obtained (f) intergroup
choices. The top reading groups made fewer
intergroup choices in the middle group (p = .06)
and in the lower group (p < .001) than expected. The
middle groups made more intergroup choices in the
top reading groups (p < .001) and fewer intergroup
choices in the lower reading groups (p < .01) than
expected. The lower reading groups made more intergroup
choices than expected in the top reading
groups (p = .005).


Summary of Choice Patterns 


The total results of the above three questions are
summarized by Figure 1. The choices of the lower
reading groups' members (number 3) can be described
as follows:

1. The members did not choose members within
   their own groups (S-). This effect was significant
   at the .025 level.

2. The members chose in the middle reading groups
   slightly less than expected (NS-).

3. The members chose in the top reading groups more
   than expected (S+). This effect was significant at
   the .001 level.

The choices of the middle reading groups' members
(number 2) were as follows:

1. The members made fewer than expected choices
   in the lower reading groups (S-). This effect was
   significant at the .01 level.

2. The members chose within their own groups slightly
   less than expected (NS-).

3. The members chose in the top reading groups more
   than expected (S+). This effect was significant at
   the .001 level.

The choices of the top reading groups' members (number 1)
were as follows:

1. The members did not choose in the lower reading
   groups (S-). This effect was significant at the
   .001 level.

2. The members did not choose in the middle reading
   groups (S-). This effect was significant at the
   .06 level.

3. The members chose within their own groups more
   than expected (S+). This effect was significant
   at the .001 level.


From the results, it was concluded that the top
reading groups were cohesive and the lower reading
groups were not. In the middle reading groups there
was a strong trend away from cohesiveness in favor
of intergroup choices, although the data were inconclusive
for a decisive statement about cohesiveness.
Since the middle groups also made significantly
more choices than expected in the top reading
groups, there is evidence in support of the
assumption that the middle reading groups
were not cohesive. The top reading groups,
using the measure of cohesiveness, were the
only psychological groups.


FIGURE 1 
DIRECTIONS OF INTER- AND INTRAGROUP CHOICES
TABLE 2
CHI-SQUARES FOR INTRA-/INTERGROUP CHOICES

Class  Group  Choice    f      F     (f-F)²/F     χ²     df    p
A1     G1     in       11    5.04     7.05
              out      10   15.96     2.23       9.28     1   <.005
       G2     in        3    6.72     2.06
              out      21   17.82      .80       2.86     1    NS
       G3     in       10   13.20      .78
              out      23   19.80      .52       1.35     1    NS
A2     G1     in       32   16.71    13.99
              out       7   22.29    10.49      24.48     1   <.001
       G2     in        4    7.71     1.78
              out      23   19.19      .71       2.49     1    NS
       G3     in        2    4.50     1.39
              out      19   16.50     0.38       2.77     1    NS
A3     G1     in       11    6.00     4.17
              out      13   18.00     1.39       5.56     1   <.025
       G2     in       11   11.79      .53
              out      22   21.21      .29       0.82     1    NS
       G3     in        3    9.64     4.57
              out      27   20.36     2.17       6.74     1   <.01
B1     G1     in       22    8.64    20.66
              out       5   18.36     9.72      30.38     1   <.001
       G2     in        4    5.04      .21
              out      17   15.96      .07       0.28     1    NS
       G3     in        7   10.80     1.34
              out      23   19.20      .75       2.09     1    NS
B2     G1     in       26   15.00     8.07
              out       7   18.00     6.72      14.79     1   <.001
       G2     in        7    5.73      .28
              out      14   15.97      .11       0.93     1    NS
       G3     in        3    2.73      .0256
              out      12   12.27      .0057      .0313   1    NS
B3     G1     in        9    6.46      .998
              out      15   17.54      .367      1.365    1    NS
       G2     in        4    6.46      .936
              out      20   17.54      .344      1.28     1    NS
       G3     in        8   12.69     1.73
              out      25   20.31     1.08       2.81     1    NS

Overall chi-square
       Top Reading Groups                       85.85     6   <.001
       Middle Reading Groups                     5.80     6    NS
       Lower Reading Groups                     15.67     6   <.025
TABLE 3
INTERGROUP CHOICES

([ill.] marks an entry that is illegible in the source copy.)

          G1 Selecting G2            G1 Selecting G3            G2 Selecting G1
Class    f      F    (f-F)²/F      f      F    (f-F)²/F      f      F    (f-F)²/F
A1       5    6.72     .44         5    9.24    [ill.]      14   [ill.]   [ill.]
A2       5   12.54    4.53         2    9.75    [ill.]      22   [ill.]   [ill.]
A3       7    9.43     .63         6    8.57    [ill.]      13    9.43    1.35
B1       4    7.56    1.68         1   10.80    [ill.]      11    7.56    1.56
B2       6   10.50    1.93         1    7.50    [ill.]      13   10.50     .60
B3       4    7.39    1.55        11   10.15    [ill.]      17    7.39   [ill.]
         χ² = 10.76, df 5          χ² = [ill.]               χ² = [ill.], df 5
         p = .06                   p = .001                  p = [ill.]

          G2 Selecting G3            G3 Selecting G1            G3 Selecting G2
Class    f      F    (f-F)²/F      f      F    (f-F)²/F      f      F    (f-F)²/F
A1       7   10.56    1.20        15    9.24    3.59         8   10.56     .62
A2       1    6.75    4.90        14    9.75    1.85         5    6.75   [ill.]
A3       9   11.78     .66        14    8.57    3.44        13   11.79     .12
B1       6    8.40     .69        17   10.80    3.56         6    8.40     .69
B2       1    4.77    2.98         9    7.50     .30         3    4.77     .66
B3       3   10.15    5.04        17   10.16    4.61         8   10.16     .45
         χ² = 15.47, df 5          χ² = 17.35, df 5          χ² = 2.76, df 5
         p = .01                   p = .005                  NS


Reasons for Making Choices 


Table 4 lists some of the children's reasons for
making their choices. The terms friend and like were
the two reasons most often given for the children's
choices. It was felt that the terms friend, like, and
nice were used almost synonymously; for example,
"I like him," "He is nice," and "He is my friend."
Some children used each term for each choice, suggesting
that when asked why they made their choices
they felt they should not use the same term for each
child. The term work ranked third and referred to a
skill such as reading, writing, arithmetic, or coloring.
The term play, the fourth ranked term, referred
to "He plays with me," "I play with him," or "We
play together." The term like was attached to most
of the work and play answers; for example, "I like to
play with him," "I like his work," "He likes to play
with me," or "We like to play together."


The category of other was used for choices such
as: "The teacher thinks she (the chosen child) is
smart," "He loves me," "She kisses me," "He
helps me catch the girls," etc.


Questions for Teachers 


The teachers were asked four questions. Question
1 asked if the reading groups had been stable during
the year or if the children had been moved from
one group to another. The six teachers said their
reading groups had been relatively stable, particularly
for the 2 months prior to the study. In general, the
changes made in the reading groups were made in the
month of January, when two to four children were
moved between the top and middle groups.


Question 2 asked if there were any other enduring
group activities within the classroom. The teachers
said that the reading groups were the only group
activity organized by them that lasted over a 1-week
period.


Question 3 asked the teachers to give the
names of the five children they felt were
the best in reading, math, sports, and the
most popular in the class.


Four teachers gave five choices. Two of them
gave the names of children whom they listed at the
top of the list of children in the top reading group.
One teacher named only one child, and the sixth teacher
did not wish to rank the children. Fifteen of the
twenty-one children who were named were members
of the top reading groups.


TABLE 4
THE STATED REASONS FOR THE CHILDREN'S CHOICES

Reasons        Number       Reasons        Number
               Choosing                    Choosing
Friend           107        Helps me          25
Like              73        Good              12
Work              52        Cute-pretty       10
Play              48        Needs help         9
Nice              47        Neighbor           9
Don't know        39        Other             49
                            TOTAL            480


DISCUSSION 


The main question of the study, "Are reading
groups psychological groups, that is, do they have
the group property of cohesiveness?" has a split
answer. The top reading groups were cohesive; therefore
they were psychological groups. The middle and
lower groups were not cohesive; therefore they were
not, by our definition, psychological groups. Two questions
immediately come to mind when viewing these
results: "Why were the middle and lower reading
groups different from the top reading groups in their
choice patterns?" and "What is the meaning or relevance
of these results to reading development?"


For a full answer as to why there was a difference
between the top reading groups and the middle
and lower reading groups, further research must be
conducted. However, speculative explanations can be
made by applying the findings of small group research;
for example, the research concerned with success-reward
as antecedents of liking may be pertinent.


The success-reward antecedents of liking can
be used as possible explanations why the members of
the top reading groups like (choose) each other, and
why the members of the lower reading groups tend
not to like each other but like the top reading groups'
members. Lott and Lott (12) suggest that success-reward
influences liking in several ways: as an attraction
to successful persons, as an attraction to persons
sharing a successful experience, as an attraction to
persons present in a success-reward situation, and
as an attraction to the source of the reward. The last
influence, attraction to the source of the reward, may
be a reciprocal success-reward situation between the
teacher and her pupils. Reciprocal liking can influence
the atmosphere of an interpersonal situation to
the extent that interaction develops. Both atmosphere
and interaction can influence the development of liking.
Success-reward is based on both the classical
conditioning and operant conditioning learning models.


Attraction to Successful Persons. Gilchrist (8)
found that successful persons tend to be chosen by both
successful and unsuccessful persons. Berkowitz (3)
found that partners who were successful in the task situation
were liked more than partners who were unsuccessful
in the task situation.


The children in the top reading groups were successful
persons because they were excelling in a skill
(task) that is a basic learning skill, and that has
great value in our society. The skill of reading is a
basic learning skill because most other school learning
is based on it. Children's advancement in school
is based on achievement, and tests of achievement are
usually printed and are therefore indirect measures
of reading ability.


Since the children in the top reading groups were
successful, they were chosen more by the other children,
both those who were successful and those who were
unsuccessful in reading.


Attraction to Person Sharing a Successful Experience.
Shelley (19) found that interpersonal liking was greater
in groups experiencing a group success than in
groups experiencing a group failure. The members
of the top reading groups were sharing in a successful
situation and therefore they liked each other. The
members of the lower reading groups were in an unsuccessful
situation and were not attracted to one another
(to failures).


Attraction to Person Present in a Success—Re- 
ward Situation. Lott and Lott (11) found that children 
developed liking for individuals present when reward 
was received. The children did not need to share in 
the reward directly, they just needed to be present 
when a child was rewarded. 


Since the members of the top reading groups were
successful in reading, each child was probably rewarded
when he participated in reading. The other children
who were present when the individual child was rewarded
would be liked more than the children not present
(other reading groups' members).


Attraction to the Source of the Reward. Persons
tend to like the person (or persons) who give rewards
(20). In the classroom this can become a reciprocal
success-reward situation between the teacher and the
children in the top reading groups. The members of
the top reading groups read well, and the teacher rewards
them. The children were attracted to the teacher
(the source of their reward). The children's liking
for the teacher could be a reward to the teacher,
and she could be attracted to the children in the top
reading groups (the source of her reward).


The teacher, feeling she was successful because 
the children were doing well and because they liked 
her, may have created a more relaxed atmosphere in 
the reading groups. A relaxed atmosphere can influ- 
ence the amount of interaction which, in turn, can in- 
fluence the development of interpersonal liking. 


Implications of Success-Reward for the Middle
and Lower Reading Groups. If the preceding interpretations
are plausible, then it can be speculated that
success-reward values will decrease as ability decreases.
This would explain why the middle reading
groups had slightly fewer in-group choices than expected,
and why the lower reading groups made few in-group
choices. The middle groups were experiencing
a degree of success-reward and were expressing
some degree of liking for the teacher. The atmosphere
might not have been quite as relaxed as for the top
reading groups but might have been enough for some
degree of interaction to occur, and for some liking to
develop.


The lower reading groups were not succeeding in
reading in relation to the rest of the class. They knew
they were poor in reading; the teacher did not reward
them enough; their peers treated them as unsuccessful
readers; their parents might not have rewarded
them and might have even, unwittingly and subtly, punished
them. In general, the situation was negative and
one to be avoided. The teacher was not rewarding
these children, and they were not rewarding her. The
atmosphere might have been so strained that interaction
was reduced. Liking was probably at a minimum because
of these negative forces.


Status in the Reading Groups 


Status influences liking as either status similar- 
ity or as status dissimilarity. People in high status 
positions tend to like people with status similar to 
themselves. People who feel they belong to a lower 
status group tend to orient upward, and choose the 
high status individuals (9). 


The members of the top reading groups were high
status individuals in the class (in a relational sense
to the two lower levels). The top reading groups were
cohesive since high status individuals tend to choose
one another, possibly recognizing high status similarity
as well as similarity in reading ability and interests.
The members of the middle and lower reading
groups, being lower status individuals in the class,
tended to orient their choices to the upward status
positions.


The status and success—reward factors can com- 
bine into a set of interdependent forces which influ- 
ence the development of liking. Low status readers 
might have chosen high status readers in order to iden- 
tify with them, or because high status persons could 
give rewards or were receiving rewards. 


Ability Grouping 

The development of liking has been interpreted
primarily in terms of the dynamics of the success-reward
relationship and the status factors between
the children. There was another factor, a very basic
one, that may have influenced the development of the
success-reward, status, and liking factors. This factor
was the criterion used in grouping for reading
ability. Consideration needs to be given to the initial
effects of placing children into groups by the ability
criterion, especially when one considers the work of
Robert Rosenthal.


Rosenthal and Lawson (17) randomly assigned
rats to research assistants for several learning tasks.
Some researchers were told that their rats were
smart, while other researchers were told their rats
were dumb. The group of supposedly smart rats
learned better than the group of supposedly dumb rats.
Rosenthal concluded that the experimenter's expectancy
toward the rats affected the rats' learning.


More recently, Rosenthal and Jacobson (16) have
studied teacher expectancy toward children's performance.
The experimental group was a random sample
of first graders. The teachers were told that
these children were late developers, and would suddenly
improve their work performance. Rosenthal
found that after a 2-year period, the children in the
experimental group had made greater gains than the
children in the control group. He concluded that the
teacher's expectancy for the children's improvement
affected her attitudes which, in turn, influenced the
children's learning.


When teachers group for reading by using the ability
criterion, there is a strong indication that they expect
the children in the top reading groups to learn
faster and better than the children in the lower reading
groups. That is, the connotation of ability grouping
is parallel to Rosenthal's rat study in that the top
reading groups are considered "smart" children,
and the lower reading groups "dumb" children. The
fact that the teachers in this study chose the "best"
children in class from the top reading groups is
some evidence for this statement. Also, five of the
six teachers gave negative indications of liking for
the children in the lower reading groups.


Not only do the teachers expect the top reading
groups to be better and the lower reading groups to
be poorer or slower, but parents reflect this attitude,
and the peer groups express a similar attitude. The
children respond to this expectancy and develop accordingly.
Mann (14) asked fifth graders to describe
themselves and got answers like: "I am in the low
fifth grade, I am too dumb," and "I happened to be
a little smarter than the rest." Mann questions the
emotional impact of ability grouping on children. Similar
evidence was found by Axline (1) and Bell (2)
while working with retarded readers. Both investigators
found that the concept-of-self as a successful
person and reader was very poor for these children.
When the children were helped to strengthen
their concept-of-self as a successful individual,
reading improved.


Teachers group children by ability and expect the
children in the top reading groups to read better than
the children in the lower reading groups. These expectations
combine with the attitudes of liking, because
the teachers, as well as the children, will tend
to like successful children better than unsuccessful children.
The teacher's expectations and liking for the children
in the top reading groups probably influence
the rewards given, the development of intragroup liking,
the reciprocal success-reward factor, the atmosphere,
and the interaction, all of which influence
the development of liking.


In a sense, the teacher stacks the cards against
the children who are in the slower range of development,
because once the teacher looks at her class in
terms of top, middle, and lower ability, she develops
an expectancy about the children that influences
the development of reading and the development of
interpersonal liking.


IMPLICATIONS 


Since high group cohesiveness has been shown to
facilitate learning, and low group cohesiveness or lack
of group cohesiveness has been shown to be related to
decreased facilitation of learning or to inhibit learning,
the direct implications of this study are: (1) ability
grouping in reading facilitates the learning of reading
in the top reading groups; (2) ability grouping in reading
inhibits the learning of reading in the lower reading
groups; and (3) ability grouping in reading is either
slightly facilitating or slightly inhibiting to the learning
of reading in the middle reading groups.

The antecedents of liking, success-reward,
atmosphere, interaction, and status influence
reading development. The antecedents of liking
and group cohesiveness combined with the
teacher's expectancy toward the children in the
top and lower ability groups also may affect
reading development. If further research shows
these statements to be reliable, then the value
of ability grouping for reading instruction
is open to question.
FOOTNOTES


1. This paper is based upon the first author's doctoral
   dissertation at the University of Kentucky.


2. The three choices for each child were not independent,
   as once a child made a selection the selected
   child could not be chosen again. The bias
   created by this lack of independence within
   subjects acts to slightly de-emphasize the greater
   than expected choosing of individuals in a
   particular group.
ABSTRACT


d and questioned from a number of different angles. 


Suggestions are given as to how data might be transformed to map the criterion data. The crucial point is that
in the behavioral sciences we never see the construct, only the criterion measures that we assume are good
measures of the construct. Therefore, we can never verify that the criterion is an interval measure of the construct.
Researchers should spend time in finding a meaningful criterion and then using nonlinear transformations
to isomorphically map the independent predictor variable(s) onto the dependent criterion variable. The
last example demonstrates that a given measure cannot be considered to be either interval or non-interval.
Whether a variable isomorphically maps a criterion is a function of the theoretical system from which the researcher
is working. Therefore it is inappropriate to refer to a variable as innately "interval."


MOST INFLUENTIAL statistical texts emphasize
the necessity of assuming that the criterion
is an interval scale for most statistical tests (e.g.,
F and t). This assumption is said to be necessary
because the tests add numbers, an arithmetic process,
which supposedly yields nonsense when a scale
is not interval. The purpose of this paper is to examine
the notion of intervalness and to suggest transformation
activities which might result in isomorphic
corresponding scales.

A scale is said to be interval when the numbers
relate monotonically and rectilinearly to a construct.
For example, if we have three pieces of rope (4 feet
long, 2 feet long, and 6 feet long), we can say that
the rope difference between 2 and 4 is the same as
between 4 and 6; furthermore, the arithmetic average
of the three lengths of rope is 4 feet, that is,
(2 + 4 + 6)/3. In all of these operations the results
make sense in the "real" world.

In contrast, the numbers in an ordinal scale are
monotonic but not necessarily linear in relation to the
construct. We may ask a student to rank in order,
on the basis of interest, five college courses. The
student may order them in the following manner:
English, Mathematics, Art, Philosophy, and Music.
We can assign numbers to the order: 5, 4, 3, 2, and
1. These numbers represent monotonicity (in this
case in descending order), but we really have no
information regarding the rectilinearity of the numbers
to the construct interest. English (scale value
5) is of more interest than Mathematics (scale value
4), but can we say English is one unit more interesting
than Mathematics? Probably not. Likewise we
cannot say Mathematics (scale value 4) plus Music
(scale value 1) combined are as interesting as English
(scale value 5). Essentially, when using ordinal
scales we go from most to least (or least to
most) but we do not know the magnitude on the construct
between [illegible in source] quite a lot (the
difference in interest might be small) and really detest
Philosophy and Music. Additivity of units is not
meaningful with the ordinal scale, and it is nonsense
to say an object rank of 5 is five times greater than
an object rank of 1. Please note, the conventional
definition of intervalness of a scale depends upon the
scale's monotonic and rectilinear relation to some
construct or object (1). Note that if there is a perfect
linear relationship between two variables, then the two
variables are measuring a construct in the same way.
An isomorphism exists between the two scales, and we
refer to this as the two variables being intervally
related. The variables may or may not be interval
measures of the constructs. Indeed it is not important
if the variables are interval measures of the construct,
as long as they are meaningful and useful measures of
behavior.

In psychological research, most of our criterion
(dependent) and predictor (independent) scales are
assumed to be measuring some underlying construct,
but we seldom (never?) know the "real" nature of
the construct. For example, we have invented the
construct intelligence and have developed (IQ) tests
to measure it, but we do not have a scale we can roll
out of the person's head to determine how monotonically
and rectilinearly the measurement scale (IQ units)
conforms to the construct intelligence. Intelligence
tests, nevertheless, do a fair job in predicting some
behaviors we call intelligent (e.g., school achievement).
In view of the fact that some tests more or
less predict behaviors they theoretically should, we
might desire to investigate the relationship of a test
(given numerical values) with the numbers assigned
to the criterion rather than the construct. To illustrate
this point let us examine the fictitious data provided
in Table 1.


TABLE 1
PROBLEM-SOLVING SCORES (Y) AND
INTELLIGENCE TEST SCORES (X)*

Individual      Y       X
    1           4       2
    2          16       4
    3          36       6
    4          64       8
    5         100      10
    6         144      12

* Two sets of scores by six individuals. X = values
the individual is assigned as a result of completing
an intelligence test. Y = the values assigned to the
individual based on his performance on job problem-solving.


The X variable is intelligence test scores and the
Y variable problem-solving performance scores. In
relation to Y scores, X is not an interval scale because
a 2-unit increase on X from 2 to 4 yields a 12-unit
increase on Y, whereas a 2-unit increase on X
from 10 to 12 yields a 44-unit increase on Y. However,
the X scale is an ordinal scale in relation to Y
because monotonicity exists. One can use the least
squares solution to solve for a and b in a regression
equation Y = a + bX to provide a line of best fit,
which represents the degree of rectilinearity between
X and Y.


FIGURE 1
INTELLIGENT PERFORMANCE AS MEASURED BY
NUMBER OF PROBLEM SOLUTIONS ON A JOB*

[Figure not reproduced; horizontal axis: IQ.]

* Line A represents the line of best fit between X
and Y by the least squares solution Y = a + bX; Y =
-37.34 + 14X. Line B represents the observed relationship
between X and Y.
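The least squares solution mentioned above can be sketched in a few lines. This is a minimal illustration that uses the six (X, Y) pairs implied by the text (Y equal to X squared for X = 2, 4, ..., 12); the variable names are assumptions made for the example and do not come from the original paper.

```python
# Least squares line Y = a + bX for the fictitious Table 1 data.
# Variable names are illustrative only.
x = [2, 4, 6, 8, 10, 12]          # intelligence test scores (X)
y = [4, 16, 36, 64, 100, 144]     # problem-solving scores (Y)

n = len(x)
mean_x = sum(x) / n
mean_y = sum(y) / n

# Slope: sum of cross-products of deviations over sum of squared X deviations.
s_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
s_xx = sum((xi - mean_x) ** 2 for xi in x)
b = s_xy / s_xx

# Intercept: the fitted line passes through the point of means.
a = mean_y - b * mean_x

print(f"Y = {a:.2f} + {b:.2f}X")
```

The slope comes out at 14 and the intercept at about -37.33, which agrees with the equation Y = -37.34 + 14X quoted in the text apart from rounding of the intercept.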


The straight line of best fit does a fair job of representing
the observed scores but tends to underestimate
Y values of the extreme X scores and overestimate
those mid-range scores. Table 2 shows the
observed scores on Y, the predicted scores (Ŷ),
and a set of difference scores (Y - Ŷ).

TABLE 2
OBSERVED SCORES ON Y, PREDICTED SCORES
(Ŷ) USING Ŷ = -37.34 + 14X, AND DIFFERENCE
SCORES (Y - Ŷ)

    Y         Ŷ        Y - Ŷ
    4       -9.34      13.34
   16       18.66      -2.66
   36       46.66     -10.66
   64       74.66     -10.66
  100      102.66      -2.66
  144      130.66      13.34

If we square each element in (Y - Ŷ) and sum
the squares we have the familiar error sum of
squares within (ESSw). This value is usually attributed
to errors of measurement and lack of perfect
association of the two measures to the underlying
construct; but we also know that the assumption of
a linear relationship in this case has been violated.
Therefore, some of the observed error must be due
to adding and subtracting nonsense. We shall always
be saddled with errors of measurement, but can we
reduce the ESS by more closely approximating an
interval relation between X and Y?


TABLE 3

Y SCORES, X SCORES, AND X SCORES SQUARED

  Y      X     X²
  4      2      4
 16      4     16
 36      6     36
 64      8     64
100     10    100
144     12    144

It is obvious that the transformation results in an isomorphism of Y and X² scores. This relationship can be expressed in a regression equation Y = 0 + 1(X²). Using this equation all predicted scores are numerically equivalent to the observed scores and thus ESSy = 0. There is a linear relationship between Y and X² (the transformed scores). In this case the original ESSy was attributable solely to the fact that the two sets of scores were not linearly related to each other. In reality we probably can't expect to observe such dramatic results due to transformations, yet let us consider the possibilities that this fictitious bit of data provides.
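
A short sketch of the same point, again assuming the illustrative data are X = 2, 4, ..., 12 and Y = 4, 16, ..., 144: regressing Y on the squared X scores should give an intercept near 0, a weight near 1, and an error sum of squares of essentially zero. The helper `least_squares` is a hypothetical name used only for this illustration.

```python
def least_squares(x, y):
    """Return intercept a and slope b of the least squares line Y = a + bX."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx
    return mean_y - b * mean_x, b


x = [2, 4, 6, 8, 10, 12]
y = [4, 16, 36, 64, 100, 144]

# Nonlinear transformation of the predictor: square every X score.
x_squared = [xi ** 2 for xi in x]

a, b = least_squares(x_squared, y)
ess = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x_squared, y))
print(f"a = {a:.4f}, b = {b:.4f}, ESS = {ess:.6f}")
```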


When we measure and ascribe numbers to perfor- 
mance we often add the number of correct responses 
and use this value as the level of performance. If three
or four of the items are extremely potent as they re- 
late to the criterion, they will contribute to making 
a scale non-interval to the criterion. For example, 
if items 1-30 are equally easy and 31-34 are more 
difficult, then the difference between scores of 28-29 
may be equivalent to the difference between 29-30 as 
they relate to the criterion (assuming these subjects 
missed items 31-34). However, the difference be- 
tween 29-30 and 30-31 as they relate to the criterion 
will not be equal. The added unit increase (30 to 31) 
may yield a much larger increase of the Y scale than
the unit increase from 29 to 30. Indeed, often we 
sum scores and really are not sure what the values 
mean to other scales or other observed behavior. 


Since the advent of the computer, most research 
investigators seldom see a scatter plot of their data 
and often miss systematic departures from a recti- 
linear relationship among the data. The consequenc- 
es may lead to an artificially large ESS due in part 
to departure from intervalness. Transformations 
may reduce this error. 


In the preceding discussion as well as in the fol- 
lowing illustrations, the reader should be aware of 
the focus of the numerical relationship. It is between
the criterion and predictor(s). We may never know
the quantitative nature of the underlying constructs, 
but if the criterion is meaningful in the context of 
one’s theory, then a search for rescaling of the pre- 
dictor(s) (and maybe even the criterion) to produce 
an approximate interval relationship seems legiti- 
mate, as long as the investigator realizes that the 
best transformation for one sample may not be gen- 
eralizable to other individuals from the population. 
It is thus incumbent upon the investigator to replicate
the transformation in successive samples from the 
population he is concerned about. 


EXAMPLES OF NONLINEAR TRANSFORMATIONS 


The following examples where a nonlinear transformation reduces error are provided to show a number of possible uses both with single and multiple predictors. In all of the following examples we shall assume that the criterion is a meaningful measured behavior.


Example 1 (Figure 2) 


FIGURE 2

A SITUATION WHERE X IS AN INTERVAL MEASURE OF THE CRITERION ALONG PART OF THE SCALE

[Plot of Criterion Measure against X, with X running from 1 to 12.]

Figure 2 represents a case where there is an interval relationship between the predictor and the criterion at the lower end of the predictor scale, but the high end of the scale does not meet the requirements of intervalness. A simple transformation on that end of the scale will create an interval X scale in relation to the particular criterion. It is interesting to note that the linear transformation X' = 2.5 + .5X for those X values above 5 will suffice. Since the X values below the value of 5 do not need to be transformed, then with respect to the entire X scale, we say that we have made a nonlinear transformation.


FIGURE 3

A SITUATION WHERE TWO TRANSFORMATIONS ARE NEEDED

[Plot of Criterion Measure against X, with X running from 1 to 8.]


Example 2 (Figure 3)


In Figure 3, two transformations of the X variable will be necessary to create an interval relationship between X and the criterion. One transformation is necessary on X scores between 0 and 3, and another on X scores above 3. Both of these transformations will be linear transformations (within the range of interest). The only problem is to find the weighting coefficients which will do the trick. Some transformations are easy enough to spot without much calculation. Others are quite intricate and demand a very precise calculation. The multiple linear regression procedure can be very useful in this rescaling process (3).

An easy method to empirically find the necessary 
transformation is to find the weighting coefficients 
associated with the straight line. Let us refer back 
to the problem represented in Figure 2 to see how 
this can be done. We need to have two vectors, one 
which allows the Y intercept to manifest itself, and 
one which allows the slope of the line to manifest it- 
self. The regression model which would accomplish 
this purpose is as follows:


Y₁ = a₀U + a₁X + E₁


Here U is a unit vector, which carries the Y intercept, and X is the vector of X scores, which carries the slope. For the X values above 5 in Figure 2 the solution gives a₀ = 2.5 and a₁ = .5. Since these weighting coefficients perfectly map the X values onto the Y values (all values in E₁ are equal to zero), we know that there is an interval relationship between X and Y. Now if we desire to make X an interval scale with respect to the criterion, all we have to do is simply apply these weighting coefficients to the X values and produce new X scores (X'):


X' = .5X + 2.5


Note that this is the same transformation that we performed for the data in Figure 2.


Now the problem in Figure 3 is a little more dif- 
ficult since two transformations are necessary. The 
logic and procedure that is presented above is still 
applicable; all that is necessary is to construct vec- 
tors which will allow the slopes and Y intercepts of 
the two lines to manifest themselves.


Y₂ = a₁U₁ + a₂X₁ + a₃U₂ + a₄X₂ + E₂


Where:


U₁ = 1 if the score on X is less than 3, zero otherwise;


X₁ = score on X if the value is less than 3, zero otherwise;


U₂ = 1 if the score on X is 3 or greater, zero otherwise;


X₂ = score on X if the value is 3 or greater, zero otherwise;


E₂ = error in prediction, Y₂ - Ŷ₂; and a₁, a₂, a₃, and a₄ are partial regression weights calculated to minimize the error in prediction.


We could have solved both lines separately, but 
it seems that a simultaneous solution is more ele- 
gant, especially if a number of these lines must be 
found. Note that the X value of 3 is used in the de- 
termination of only one of the lines, although which 
line is used is entirely arbitrary as the data conform to both lines. The two transformations that are necessary are:

if X < 3, X' = 0 + 2X

if X ≥ 3, X' = 5 1/4 + 1/4 X


We will then have an interval mapping of X' on Y.
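
The simultaneous solution can be sketched as a single least squares problem with two indicator vectors and two score vectors. The data below are hypothetical: they were constructed to fall exactly on the two lines reported above (the Figure 3 data themselves are not recoverable), and the helper names `solve_linear_system` and `fit_two_segments` are ours. The sketch solves the normal equations with a small Gauss-Jordan routine so that it needs nothing beyond the standard library.

```python
def solve_linear_system(matrix, rhs):
    """Solve matrix * w = rhs by Gauss-Jordan elimination with partial pivoting."""
    n = len(rhs)
    m = [row[:] + [value] for row, value in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_two_segments(x, y, cut):
    """Fit separate intercepts and slopes below and at-or-above `cut` in one model."""
    rows = []
    for xi in x:
        below = xi < cut
        rows.append([
            1.0 if below else 0.0,   # indicator: X below the cut
            xi if below else 0.0,    # X score if below the cut, else zero
            0.0 if below else 1.0,   # indicator: X at or above the cut
            0.0 if below else xi,    # X score if at or above the cut, else zero
        ])
    k = 4
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)


# Hypothetical data lying exactly on the two reported lines.
x = [0, 1, 2, 3, 4, 5, 6, 7, 8]
y = [2 * xi if xi < 3 else 5.25 + 0.25 * xi for xi in x]

weights = fit_two_segments(x, y, cut=3)
print([round(w, 4) for w in weights])
```

Because no observation has nonzero entries in both pairs of vectors, the simultaneous solution coincides with fitting the two lines separately; the single model is simply more convenient when several segments are needed.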


Some readers may question the appropriateness of such a transformation. Indeed we are uniquely fitting lines to a sample of data. Whether or not this is "the" transformation can to some extent be determined by randomly drawing another sample of data from the same population and applying the same transformation to that second set. If the correlation between the transformed X scores and the criterion is close to 1.0, then the transformation can be considered appropriate and X' is an interval scale with respect to the criterion under consideration.


Example 3 (Table 4)


The purpose of this example is to establish the non-necessity of the traditional notion of the assumption of interval scales. It appears to us that, with respect to the several goals of research (predictability, parsimony, replicability, validity generalization, control, and understanding), the assumption of an interval scale is really only necessary for the last named goal.


We shall now investigate a situation wherein both scales are clearly only ordinal measures. Scales H and I in Table 4 are both ordinal with respect to the underlying construct of length. For example, the 1-inch difference between lines A and B results in a difference of 12 units on the H scale, whereas the 1-inch difference between lines B and C results in a difference of 20 units on the H scale. Nevertheless, the correlation between scales H and I will yield an r value of 1.0. The transformation necessary to go from H to I is: I = 2 × H. This perfect correlation tells us that there is a 1-to-1 monotonic and rectilinear relationship between scales H and I. The point is that the correlation coefficient can be computed and the validity ascertained and tested for significance while one is fully aware of the fact that neither of the two variables is an interval measure. As a result of the perfect correlation between these two ordinal variables, one has met some of the goals of research, particularly those of predictability and parsimony. The goals of replicability, validity generalization, and control have not been investigated, although the ordinal nature of the scales does not obviate the attainment of these goals.
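
As a rough check of this claim, the sketch below computes Pearson correlations by hand. It assumes the six lines measure 1 through 6 inches and that Scale H runs 4, 16, 36, 64, 100, 144 (values implied by the 12-unit and 20-unit differences quoted above), with Scale I equal to twice Scale H. The function name `pearson_r` is ours.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


inches = [1, 2, 3, 4, 5, 6]              # assumed lengths of lines A-F
scale_h = [4, 16, 36, 64, 100, 144]      # assumed Scale H units
scale_i = [2 * h for h in scale_h]       # I = 2 x H

r_h_i = pearson_r(scale_h, scale_i)
r_inches_h = pearson_r(inches, scale_h)
print(f"r(H, I) = {r_h_i:.4f}")
print(f"r(inches, H) = {r_inches_h:.4f}")
```

The two scales correlate perfectly with each other even though neither is rectilinearly related to length in inches.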


One would be hard put to argue that the use of or- 
dinal scales can assist one in reaching the goal of 
understanding, for as Hays points out: 


Although the numbers standing for ordinal measurements may be manipulated by arithmetic, the answer cannot necessarily be interpreted as a statement about the true magnitudes of objects, nor about the true amounts of some property. (2:71)


TABLE 4


SIX LINES AND THREE SYSTEMS OF MEASURING THESE LINES


Line    Length Using Inch-Scale    Scale H Units    Scale I Units

A                  1                      4                8
B                  2                     16               32
C                  3                     36               72
D                  4                     64              128
E                  5                    100              200
F                  6                    144              288


Example 4 (Figure 4)


We have one further way to attack the traditional notion of intervalness and that involves the notion that a scale is either interval or it is ordinal. We are aware that most researchers would argue that there are degrees of intervalness in that some scales come closer to being interval measures than do others. Figure 4 (a and b) presents a situation wherein a scale is interval in one situation (Figure 4a) and quite ordinal in another (Figure 4b). It is interesting to note that the criterion and the way it is measured are exactly the same in both situations. There is one minor difference and that is the system within which the variables are being related.


FIGURE 4


A STATE OF AFFAIRS WHEREIN TWO SCALES ARE RECTILINEARLY RELATED TO ONE ANOTHER AND ANOTHER STATE OF AFFAIRS WHEREIN THE SAME SCALES ARE NOT RECTILINEARLY RELATED TO ONE ANOTHER, UNTIL A NONLINEAR TRANSFORMATION IS PERFORMED


4a Observation of variables on the surface of the earth (Distance against Time).

4b Observation of variables in vacuum moving towards the earth (Distance against Time).

4c Observations of variables in vacuum moving towards the earth, after nonlinear transformation on time (Distance against Time Squared).

4d Observations of variables in vacuum moving towards the earth, after nonlinear transformation on distance (Square Root of Distance against Time).


We usually think of distance (as measured in feet) and time (as measured in seconds) as interval measures. Indeed they are when we observe objects traveling along the surface of the earth (Figure 4a). A car traveling at a constant speed moves a very predictable distance in a given number of seconds, and for any given second, the car will travel a certain constant distance. Now look at Figure 4b, where we are concerned about studying the variables of distance and time in another instance, that of moving towards the earth. Now we do not observe an interval relationship between distance and time, although there appears to be a systematic relationship between the two variables. Which of the two variables is no longer interval? This seems to be a nonsensical question as it really doesn't matter, for we can transform t to perfectly map d (d = t², as in Figure 4c), or we can transform d to perfectly map t (t = √d, as in Figure 4d). What is important is that, as a result of the interval relationship between, say, t² and d, we can predict d if we know t; we can get this prediction in a relatively parsimonious fashion; we can check into the replicability of the relationship (indeed falling bodies in a vacuum will always follow the relationship); and we can also verify the validity generalization of the relationship (no matter what the mass, nor what the weight, all bodies will follow this relationship in a vacuum). In order for prediction to be accurate at various altitudes and on other planetary bodies, the concept of gravity must be entertained (i.e., d = 1/2 gt²). Furthermore, once we ascertain the above, we control the distance that an object will fall, and likewise we can control the amount of time we allow it to fall. We still may not understand why all this can be done, but at least we have made quite a few inroads into the phenomenon under consideration. And we have made all these discoveries with at least one ordinal scale!
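
The falling-body case can be illustrated with a brief sketch. It assumes idealized, error-free observations generated from d = 1/2 gt² with g = 32 feet per second squared (a rounded value chosen for the illustration, not taken from the article), and the name `least_squares` is ours. Regressing d on t leaves systematic error, whereas regressing d on t² leaves none and recovers a slope of g/2.

```python
def least_squares(x, y):
    """Return intercept a and slope b of the least squares line Y = a + bX."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx
    return mean_y - b * mean_x, b


def error_sum_of_squares(x, y):
    """Error sum of squares around the least squares line of y on x."""
    a, b = least_squares(x, y)
    return sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))


G = 32.0                                   # assumed, feet per second squared
seconds = [0, 1, 2, 3, 4, 5]
feet = [0.5 * G * t ** 2 for t in seconds]  # idealized fall in a vacuum

ess_raw = error_sum_of_squares(seconds, feet)
seconds_squared = [t ** 2 for t in seconds]
ess_transformed = error_sum_of_squares(seconds_squared, feet)
intercept, slope = least_squares(seconds_squared, feet)

print(f"ESS using t:   {ess_raw:.2f}")
print(f"ESS using t^2: {ess_transformed:.6f}")
print(f"slope on t^2 = {slope:.2f} (half of g)")
```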


The preceding comments pertain not only to cor- 
relational procedures, but also to analysis of vari- 
ance procedures and any other procedure which is 
based upon the least squares solution. When the 
analysis of variance model is being used, the meaningful transformation will of necessity be on the criterion, unless there is some justification for ordinality of the independent variable.


The examples provided above have dealt with rescaling where monotonicity exists; however, there is no reason to assume that the two or more scales to be investigated must be monotonically related. The literature relating test performance to anxiety measures often reveals a non-monotonic, non-rectilinear relation between the two measures. The transformation procedures outlined above can apply to the non-monotonic case as well. Indeed, the inverted-U-shaped function usually observed can easily be reflected by the equation Ŷ = a + b₁X + b₂X², where the X variable represents anxiety and X² is the square of the elements in vector X. Such an equation will yield a positive value for weight b₁ and a negative value for b₂.
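
A minimal sketch of such a fit follows. The anxiety and performance values are invented for the illustration (they lie exactly on an inverted U), and the helper names `solve_linear_system` and `fit_quadratic` are ours. The fit adds the squared predictor as a second vector and solves the normal equations directly.

```python
def solve_linear_system(matrix, rhs):
    """Solve matrix * w = rhs by Gauss-Jordan elimination with partial pivoting."""
    n = len(rhs)
    m = [row[:] + [value] for row, value in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_quadratic(x, y):
    """Least squares weights (a, b1, b2) for Y = a + b1*X + b2*X^2."""
    rows = [[1.0, float(xi), float(xi) ** 2] for xi in x]
    k = 3
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)


# Hypothetical data: performance rises, peaks, then falls as anxiety increases.
anxiety = [1, 2, 3, 4, 5, 6, 7, 8, 9]
performance = [10 + 8 * x - x ** 2 for x in anxiety]

a, b1, b2 = fit_quadratic(anxiety, performance)
print(f"a = {a:.3f}, b1 = {b1:.3f}, b2 = {b2:.3f}")
```

With these invented data the weight on X comes out positive and the weight on the squared term negative, as the text predicts for an inverted-U relation.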


SUMMARY 


Firstly, when one has what is known to be an ordinal scale, one may still have valuable information.


THE TEACHING OF YOUNG CHILDREN: SOME APPLICATIONS OF PIAGET'S LEARNING THEORY


Brearley, Molly, Editor (New York: Schocken Books, Inc., 1970), 191 pp.

A prerequisite to reading this short treatise on the teaching of young children is the reader's understanding of (a) Piaget's learning theory and (b) the British Infant School. The former is the educational foundation and the latter is the educational setting; yet neither is given enough description, documentation, or explanation to be of real value to the uninformed reader.


Seven educators at the Froebel Educational Institute in England have compiled this book "to more precisely determine the goals of education for young children and how these goals may best be met." Except for the introduction written by Molly Brearley, Principal of the Institute, no credit is given for individual chapters which deal with: science, art, literature, movement, mathematics, music, morality, psychological standpoint, and teachers and children. Nor are professional biographies or even areas of professional interest listed. The reader is left with only a vague notion of the expertise underlying the presented points of view. This becomes of special importance when one looks beyond Piaget to the authors' "applications" of Piaget's learning theory. It is here that the book should have its greatest impact, yet it is here that it has its greatest shortcoming.


The strong point of the book is Piaget and the authors' reaffirmation of the best we know about children and how they learn. (Many other experts on children and learning are referred to as frequently as Piaget. This raises the


The chapter on science emphasizes scientific thinking rather than the too often teacher perceived role of "pounding in" the content of science. In the art chapter there is a good discussion of the stages leading to expression of representations through drawing, printing, modeling, and construction which aims at, but doesn't directly hit, the need for art as expression rather than a daily art product.


Under literature, the wide variety of literary forms are examined for the ways they can lay foundations for highly personal experiences as well as serving to extend language [...] of expressing and communicating and this chapter looks at the patterns children can [...] themselves as unique individuals.


The main function of the school in relation to mathematical development is described as providing children with the environment and helping to bring about a match between their personal psychological learning structures and the logical structures of mathematical knowledge. Ordering, relating, and measuring are emphasized.


Hearing and making sound get equal attention in the chapter on Music which stresses the need for an environment which allows children to discover and experiment for themselves with musical things and processes.


"Morality: Values and Reasons" deals with the concept of self and its relations with others, laying the foundation


(continued on page 67)


INTERACTIONS OF ATTITUDES AND ASSOCIATIVE INTERFERENCE IN CLASSROOM LEARNING


WILLIAM L. MIKULAS 
University of West Florida 


ABSTRACT


Ss were tested on their knowledge, attitude, and prior experience toward one of two topics. Then they were assigned a reading or given a lecture on the relevant topic and retested after 2 days and 6 weeks, with half of the Ss given feedback after the second test. Analyses determined to what extent the following variables affected changes in attitudes and factual knowledge: topic, mode of instruction, feedback, prior attitude, prior factual knowledge, prior experience, and evaluation of the lecture or readings.


THE PROMINENT explanation of human forgetting of verbal material is associative interference theory (4) in which the major retention loss is due to competition from alternative responses at the time of recall. Most of the relevant studies, however, have utilized meaningless material such as nonsense syllables. Recent studies involving learning meaningful material (1, 2, 7) have failed to demonstrate simple interference effects, although in some cases meaningfulness may have been confounded with degree of original learning (8).


Classroom learning isn't as affectless as most verbal learning situations. Rather, the students often have very definite attitudes toward the material to be learned and the learning involves attitude changes as well as acquisition of content. Variables in the classroom that affect changes in student attitudes include the perceived credibility of the instructor, the personal involvement of the student in his position, the difference between the student's position and the position the instructor advocates, and how one-sided the instructor's message is perceived (5, 10).


An important educational objective is to provide the student with feedback concerning the accuracy and rate of his progress. Feedback may have any combination of the following effects: it can strengthen the student's learning as when he finds out he was right, it can result in motivational changes such as in the goals the student sets (6), it can change the direction the student is moving in his learning, and it can be a new learning experience or rehearsal of previous learning.


In the present study college students learned, by lecture or readings, about a topic on which they had definite attitudes and ideas. By assessing their attitudes and factual knowledge before the new learning, immediately after, and 6 weeks after, changes in attitudes and factual beliefs were investigated and compared with predictions from models of attitude change and verbal learning. Also, half of the students were provided feedback immediately after the second test. Most important, it was possible to determine interactions between the variables of verbal learning, attitude change, and feedback.


METHOD 


Subjects 


The Ss were the students of ten sections of an introductory psychology course with about twenty-five students per section. Eight of the sections comprised the eight possible experimental groups resulting from all combinations of the following three variables: Topic (hypnosis or therapy), Mode of learning (lecture or readings), and Feedback (present or absent). The other two sections comprised two control groups. As much counterbalancing as possible was done relative to the time the class met, the instructor's teaching method, the sex of the instructor, and course content.


Sequence of Events

All experimental Ss first took a pretest (test 1) on either hypnosis or therapy. Two days later they received new learning in the appropriate subject area by one of two modes of instruction, lecture or readings. Two days and six weeks after the new learning the Ss were given retests, tests 2 and 3, over the material of test 1. Half the Ss received feedback over the correct answers to test 2 immediately after that test.


The two control sections had test 1 on one of the two topics, a lecture on the opposite topic, and tests 2 and 3 on the same topic as test 1.


INDEPENDENT VARIABLES 


Topic. The content of the new learning was either hypnosis or therapy. The new learning on hypnosis, based on Hypnosis in Perspective (9), discussed basic phenomena of hypnosis and stressed the inherent dangers of being hypnotized, particularly by an unskilled hypnotist. "New Ways in Psychotherapy" (3) was the basis for the new learning on therapy, which argued for "behavior therapy" as opposed to psychoanalysis.


Mode. The mode of the new learning was either 
by readings or lecture with both covering the same 
material. In the analyses comparisons were made 
between reading groups and lecture groups (Mode 2 
analysis) or among reading groups, lecture groups, 
and groups that missed the new learning (Mode 3
analysis). 


E assigned the readings and gave the lectures 
with the instructors of the classes telling Ss that the 
new learning was a course requirement. Ss were 
told before the new learning that the pretest empha- 
sized certain misconceptions and central ideas which 
they would be retested on in 2 days with a test sim- 
ilar to the first. 


Feedback. Following the new learning and test 2 on the new learning, half the Ss received feedback, a handout to read in class which identified the correct answers to test 2. E then answered any questions the Ss had.


DEPENDENT VARIABLES

Test 1. This pretest was composed of three parts:


(1) One of two short situational questions to assess the Ss' attitudes toward the particular topic: If the person you married became "mentally ill," would you send him to psychoanalysis? Would you allow yourself to be hypnotized by a doctor or dentist for medical purposes, or by an entertainer in a nightclub?


(2) Ten "factual" statements on which the Ss marked their degree of belief on 5-point scales: sure it is true, think it is true, don't know, think it is false, sure it is false.


(3) A biographical question asking Ss to list all psychology courses they had had plus any experience or training related to the previous questions.


Factual items for the tests were pretested on non-psychologists and an advanced psychology course. The final items were chosen such that there would probably not be any statements for which the Ss would be in general accord with the position of the new learning.


All three tests were prepared in two forms, A and B, which differed only in the wording of some of the statements, the situation question, and the order of the statements. The two forms were constructed to be equivalent, an aim supported by the results of the present study. For although the tests were found to be highly reliable in terms of the effects of the main variables, the form of test 1 did not have a significant effect on either the attitude or factual scores of test 1. And the particular sequence of test forms between tests 1, 2, and 3 did not affect changes in attitude or factual scores between the tests.


On tests 1 and 2 a random half of each class was 
given form A and the other half form B. On test 3 
each S had the opposite form from his test 2, or the 
opposite of test 1 if he missed test 2. 


Test 2. Test 2 began with the same attitude question and ten factual statements as test 1. In addition Ss evaluated on 5-point scales (1 = very good, 2 = good, 3 = neutral, 4 = bad, 5 = very bad) the (a) quality, (b) credibility, and (c) bias (one-sidedness) of the readings or lecture. Or the S indicated that he was unable to do the readings or missed the lectures. To increase the validity of the Ss' answers about missing the new learning the E strongly emphasized that it didn't matter if they missed it but it did matter that they answer truthfully, and that their instructors would never see their answers.


For each topic the bias evaluations were grouped together in three approximately equal groups to form the variable Bias classification. For therapy the upper third contained scores 1-2; middle third, score 3; and bottom third, scores 4-5. For hypnosis the upper third contained score 1; middle third, score 2; and bottom third, scores 3-5.


Test 3. Test 3 consisted of the attitude question and the ten factual statements.


Scoring of Tests

Attitude scores. The following gives the scale values for the different responses to the attitude question on hypnosis.

Score 1: S would allow himself to be hypnotized by anyone.

Score 2: S would allow anyone to hypnotize him only with certain restrictions.

Score 3: S would only allow a "qualified" person (e.g., doctor or dentist) to hypnotize him.

Score 4: S would only allow a qualified person to hypnotize him under certain conditions.

Score 5: S would never allow himself to be hypnotized.


A similar scale was used to assign scores to the answers to the therapy attitude question:

Score 1: S would use psychoanalysis;

Score 2: S would use psychoanalysis but with some restrictions;

Score 3: S would use psychoanalysis but only in conjunction with other procedures;

Score 4: S would use psychoanalysis after trying other approaches;

Score 5: S would not use psychoanalysis.


The attitude scores were divided by topic into three approximately equal groups to form the variable Attitude classification. For therapy the top third contained score 1; middle third, score 2; and bottom third, scores 3-5. For hypnosis the top third contained scores 1-2; middle third, score 3; and bottom third, scores 4-5.


Factual scores. In addition to scores given the statements by Ss, each response was marked "right" or "wrong" relative to the new learning. Thus, if S marked a statement 2 (think it is true) while the lecturer said it was false, this item would be marked wrong. Scores of 3 (don't know) were classified as "right" since the emphasis in this study was on items about which the Ss had misconceptions. In the later analyses of factual changes classifying a score of 3 as right is logically equivalent to not assigning it any label at all.
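
The right/wrong rule can be written out as a small sketch. It assumes the 5-point belief scale is coded 1 = sure it is true through 5 = sure it is false, as listed for test 1; the function names `is_right` and `factual_score` and the sample responses are hypothetical.

```python
def is_right(belief, new_learning_says_true):
    """Classify one response (1-5 belief scale) as right or wrong.

    A response of 3 ("don't know") counts as right, since only
    misconceptions are treated as wrong.
    """
    if belief == 3:
        return True
    believes_true = belief < 3
    return believes_true == new_learning_says_true


def factual_score(beliefs, item_truths):
    """Number of items marked right out of those answered."""
    return sum(is_right(b, t) for b, t in zip(beliefs, item_truths))


# The example from the text: S marks 2 on an item the new learning calls false.
print(is_right(2, new_learning_says_true=False))   # wrong
print(is_right(3, new_learning_says_true=False))   # "don't know" counts as right

# Hypothetical responses to ten items, all of which the new learning calls false.
sample_beliefs = [1, 2, 3, 4, 5, 2, 3, 4, 1, 5]
sample_score = factual_score(sample_beliefs, [False] * 10)
print(sample_score)
```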


The variable Factual classification was constructed by dividing the factual scores, separated by topic,
into three approximately equal groups. For therapy 
the top third contained scores 0-3; middle third, 
scores 4-5; and bottom third, scores 6-10. For hyp- 
nosis the top third contained scores 0-5; middle 
third, scores 6-8; and bottom third, scores 9-10. 


Experience scores. In response to the question about prior experience in the area, S was scored 1 if he had no prior experience; 2 if he had had some general courses related to the topic or had done some reading in the area; and 3 if he had had personal experiences in the area, courses specifically dealing with the topic, or some personal involvement in the issues.


RESULTS

A series of fifty-eight independent two-way analyses of variance were done using the following classification variables:

Topic: hypnosis or therapy;

Mode 2: lecture or reading;

Mode 3: lecture or reading or missed the new learning;

Feedback: received feedback or did not;

T1 form: form A or form B of test 1;

Attitude classification: the third of all test 1 attitude scores on a given topic in which a particular attitude score fell, i.e., top, middle, or bottom third;

Factual classification: the third of all test 1 factual scores on a given topic in which a particular factual score fell;

Experience: the score of the answer to the question about prior experience in the area;

Bias classification: the third of all evaluations of bias for a given topic in which a particular score fell;

Test sequence: the sequence of the forms of the 3 tests used for a particular S, e.g., ABA and BBA;

Section number: the particular section S was in.


Throughout the results main effects and interactions listed as not significant are those that were not statistically significant at less than the .05 level. Because of the number of statistical tests computed, it is possible that some of the apparently significant findings occurred by chance. However, the consistency of the results makes this improbable, at least for the main effects.


Test 1 Results: Attitude Scores and Factual Scores 


Attitude scores. An analysis of variance on the scores of the answers to the attitude question showed Topic to have had a significant effect (F = 38.60; df = 1, 144; p = .0001) with the scores on the hypnosis tests (mean = 3.02) higher than those on the therapy tests (mean = 1.76). That is, Ss in the Hypnosis groups were hesitant about being hypnotized, but Ss in the Therapy groups were fairly sure they would send their spouses to psychoanalysis.


The following did not have a significant effect on the attitude scores:

T1 form
Section number X T1 form
Experience
Topic X Experience

Section number was by necessity significant since Topic was significant and each section had only one topic.


Factual scores. The factual scores are the number of factual items out of ten marked "correct" on the basis of the Topic material. As was true of the attitude scores, Topic had a significant effect on the factual scores (F = 73.79; df = 1, 144; p < .0001) with the mean score on hypnosis tests (6.93) higher than the mean for the therapy tests (4.19). That is, Ss incorrectly marked more statements about therapy than about hypnosis, although it is improbable the tests were of equal difficulty. As with the attitude scores, the following did not significantly affect the factual scores:

T1 form
Section number X T1 form
Experience
Topic X Experience


Test 2 Results: Evaluations of Quality, Credibility, 
and Bias 


On test 2 all Ss evaluated the quality, credibility,
and bias (one-sidedness) of the new learning. Since
most of the Ss marked the quality and credibility very
good or good, regardless of Topic or Mode, these variables
were not considered further.


The bias evaluations showed more dispersion, with
Topic significantly affecting the scores (F = 72.67; df
= 1, 95; p < .0001). The lectures and readings on therapy
were considered by the Ss to be more biased than
those on hypnosis. The following did not significantly
affect bias scores:


Mode 2 

Topic X Mode 2 

Attitude classification 

Factual classification 

Attitude classification X Factual classification 
Experience 

Topic X Experience 


Changes Between Tests 1, 2, and 3


Factual changes. For each factual item the change
in its score relative to the new learning was computed
between test 1 and test 2 and between test 1 and
test 3. Changes between test 2 and test 3 were not
computed directly, as this would violate the statistical
assumption of independence for the analyses. If
an S marked an item with a 2 (think it is right) on
test 1 and marked the corresponding item with a 5
(sure it is false) on test 2, this would be a change of
three units. If the new learning said the item was
false, it would be a change of +3; if the new learning
said it was true, it would be a change of -3.
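

The scoring rule just described can be illustrated with a short sketch. It assumes the rating scale runs from 1 to 5 with higher values meaning greater certainty that the item is false; the text defines only 2 (think it is right) and 5 (sure it is false). The function and parameter names are hypothetical and are not taken from the study.

```python
def signed_factual_change(before: int, after: int, new_learning_says_false: bool) -> int:
    """Change in an item's rating, signed relative to the new learning.

    ASSUMPTION: ratings lie on a 1-5 scale where higher means "more sure
    it is false" (the text defines only 2 and 5 explicitly).
    """
    for rating in (before, after):
        if not 1 <= rating <= 5:
            raise ValueError("ratings are assumed to lie between 1 and 5")
    shift_toward_false = after - before
    # Movement toward the position taken by the new learning counts as positive.
    return shift_toward_false if new_learning_says_false else -shift_toward_false


# The worked example from the text: a 2 on test 1 and a 5 on test 2.
change_if_false = signed_factual_change(2, 5, new_learning_says_false=True)
change_if_true = signed_factual_change(2, 5, new_learning_says_false=False)
```

With the example ratings, the first call corresponds to the +3 case in the text and the second to the -3 case.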


Table 1 lists those variables which significantly
affected factual changes between test 1 and test 2,
while Table 2 lists the changes between tests 1 and 3.


The order of the attributes of the corresponding
variables in Tables 1 and 2 is the same. For example,
Topic has two attributes, therapy and hypnosis, and
there was a greater factual change in the Therapy
groups than the Hypnosis groups, both between test
1 and test 2 and between test 1 and test 3. Lecture
produced a greater change than readings (Mode 2 analysis),
and both of these produced a greater change than
no new learning (Mode 3 analysis). Ss with a medium
amount of experience showed the greatest change,
Ss with the most experience changed least, and Ss
with no experience were in between. The lower the
S's factual score on test 1 (the fewer the number of


TABLE 1


VARIABLES AFFECTING FACTUAL CHANGES BETWEEN
TESTS 1 AND 2


Variable                  df        F        p=
Topic                     1, 97     5.45     .025
Mode 3                    2, 124    26.16    .0001
Mode 2                    1, 97     9.74     .005
Experience                2, 92     3.41     .05
Factual classification    2, 124    5.09     .01


TABLE 2


VARIABLES AFFECTING FACTUAL CHANGES BETWEEN
TEST 1 AND TEST 3


Variable                  df             F        p=
Topic                     1, 84          11.75    .001
Mode 3                    [illegible]    24.11    .0001
Mode 2                    1, 84          9.97     .005
Experience                2, 68          3.20     .05
Factual classification    2, 109         8.94     .001
Feedback                  1, 88          8.83     .005
Bias classification       2, 68          4.43     .025


correct responses), the greater the change. Those
groups that received feedback after test 2 showed a
greater change on test 3 than groups without feedback.
Although Bias classification did not significantly affect
factual changes between tests 1 and 2, it had a significant
effect on the factual changes between tests 1 and
3. Ss who judged the new learning as very biased
showed the greatest factual change on test 3, while Ss
who judged it as the least biased showed the least change.


The following variables and interactions did not
have a significant effect on the factual changes between
test 1 and test 2:


Bias classification 

Test sequence 

Attitude classification 

Topic X Mode 2 

Topic X Mode 3 

Topic X Factual classification 

Topic X Attitude classification 

Topic X Bias classification 

Topic X Experience 

Mode 2 X Attitude classification 

Mode 2 X Factual classification 

Mode 2 X Bias classification 

Experience X Bias classification
Factual classification X Attitude classification


The following variables and interactions did not have
a significant effect on the factual changes between
test 1 and test 3:


Test sequence 

Attitude classification 

Topic X Mode 2 

Topic X Mode 3 

Topic X Factual classification 

Topic X Attitude Classification 

Topic X Bias classification 

Topic X Experience 

Topic X Feedback 

Mode 2 X Feedback 

Mode 2 X Factual classification 

Mode 2 X Attitude classification 

Mode 2 X Bias classification 

Mode 3 X Feedback 

Experience X Bias classification
Factual classification X Attitude classification 


TABLE 3


VARIABLES AFFECTING ATTITUDE CHANGES BETWEEN
TESTS 1 AND 2


Variable                                            df        F        p<
Topic                                               1, 124    5.34     .025
Mode 3                                              2, 124    5.12     .01
Attitude classification                             2, 121    10.20    .0001
Mode 2 X Bias classification                        2, 94     4.90     .01
Attitude classification X Factual classification    4, 121    2.87     .05


For two of the variables, Topic and Factual classification,
the level of significance increased between
test 2 and test 3. For both variables the attribute
producing the biggest change (therapy, low factual
classification) showed an increase between test 2
and test 3, while the attribute with the smallest
change (hypnosis, high factual classification) showed
a decrease.


Attitude changes. For each S the amount of change
in attitude, positive or negative, relative to the new
learning (score 5) was computed between tests 1
and 2 and between tests 1 and 3. The variables having
a significant effect on changes in attitude are shown
in Tables 3 and 4.


As with factual changes, the order of the attributes
of the variables significantly affecting attitude change
is the same between tests 1 and 2 as between tests 1
and 3. In the case of Topic, the Ss in the Therapy
groups showed a greater attitude change on test 2
than Ss in the Hypnosis groups. The Therapy groups
also showed a greater change on test 3. In the Mode
3 analysis (lecture vs readings vs missed the new
learning) the lecture groups showed more change
than the readings groups, which showed more change
than the groups that missed the new learning. But
since the Mode 2 analysis (lecture vs readings) was
not significant, the effect of the Mode 3 analysis was
due to the lecture and readings groups having significantly
more change than the group that missed the
new learning. In the case of Attitude classification,
the less the Ss' original attitudes agreed with the new
learning, the greater the attitude change. This relation
of Attitude classification to attitude change parallels
the effect of Factual classification on factual
change.


TABLE 4


VARIABLES AFFECTING ATTITUDE CHANGES BETWEEN
TEST 1 AND TEST 3


Variable                        df        F       p<
Topic                           1, 111    6.33    .025
Mode 3                          2, 117    3.39    .05
Attitude classification         2, 107    8.00    .001
Mode 2 X Bias classification    [entries illegible]

The interaction between Mode 2 and Bias classification
revealed that for Ss who perceived the new
learning as very biased or very unbiased the lecture
was more effective for changing attitudes, while for
Ss who perceived the new learning as being moderately
biased, the reading was more effective. The significant
interaction between Attitude classification and
Factual classification on test 2 is somewhat more
complex. In the case of low factual scores on test 1,
the attitude change is greatest for Ss with extreme
attitudes on test 1. In both cases this consisted of a
drifting toward the mean, i.e., Ss with low attitude
scores showed positive changes and those with high
attitude scores showed negative changes. This may
simply be due to regression toward the mean.
In the case of high factual scores on test 1, the attitude
change was the greatest for Ss with intermediate
attitude scores on test 1, the changes here being in a
positive direction. Ss intermediate in their factual
scores on test 1 were intermediate in performance to
the two cases above.


The following variables and interactions did not
have a significant effect on the attitude changes between
test 1 and test 2:


Mode 2 

Experience 

Bias classification 

Test sequence 

Factual classification 

Topic X Mode 2 

Topic X Mode 3 

Topic X Factual classification 
Topic X Attitude classification 
Topic X Bias classification 
Mode 2 X Factual classification 
Mode 2 X Attitude classification 
Experience X Bias classification 


The following variables and interactions did not
have a significant effect on the attitude changes between
test 1 and test 3:


Mode 2 
Feedback 

Experience 

Bias classification 

Test sequence 

Factual classification 

Topic X Mode 2 

Topic X Mode 3 

Topic X Experience 

Topic X Feedback 

Topic X Factual classification 
Topic X Attitude classification 
Topic X Bias classification 
Mode 2 X Feedback 

Mode 2 X Factual classification 
Mode 2 X Attitude classification 
Mode 3 X Feedback 

Experience X Bias classification 
Attitude classification X Factual classification 


Control group results 


In addition to control data derived from analyses
such as with the variable Mode 3, two groups were
specifically included to test for halo effects [...].
For these groups, as described in the method section,
the three tests were on one topic while the new
learning was a lecture on the opposite topic. A series
of t-tests determined if the amount of attitude or factual
changes between tests 1 and 2 or between tests
1 and 3 was significantly different from zero. None
of the tests yielded values significant at the .05 level.


DISCUSSION 


Evidence for test reliability comes from the consistency
of the F-ratios and order of the attributes
between tests 2 and 3, which were 6 weeks apart. For
example, lecture produced a bigger factual change than
reading on test 2 with F = 9.74, and again lecture was
more effective than reading on test 3 with F = 9.97.
Inspection of Tables 1-4 shows this consistency to
generally hold up for both factual and attitude scores.


Some of the results can most simply be explained
by the principle that the greater the room for improvement,
the greater the change that will occur. The Therapy
groups showed more factual change than the Hypnosis
groups because the Therapy groups had significantly
lower factual scores (number of "correct"
items) on test 1. Similarly, analysis of Factual classification
showed that across both topics the lower
the Ss' factual scores on test 1, the greater the amount
of factual change on tests 2 and 3.


A parallel situation is true of changes in attitude.
Hypnosis groups had significantly higher attitude
scores on test 1 than Therapy groups. Since the new
learning argued for attitudes with high scores, the
Therapy groups showed greater changes in attitude
than the Hypnosis groups. Similarly, the lower an S's
attitude score on test 1 (according to Attitude classification)
across both topics, the greater the amount
of attitude change.


Change Due to Mode, Experience, and Feedback


The superiority of lectures over readings for
changing attitudes and factual beliefs may be because,
particularly with controversial material, the lecturer
can alter his presentations to fit each particular class
and because lectures more easily elicit emotional
responses. However, generalizations are limited
because there was only one lecturer and two readings.


The effect of Experience was that the Ss with a
medium amount of prior experience showed the most
factual change, those with no prior experience the second
most, and those with considerable experience the
least. The small factual change with high experience
Ss is probably because these Ss already knew the
most and had the least room for change and/or were
the most personally involved in their position and
hence the most resistant to change.


One explanation for medium experience Ss showing
more factual change than low experience Ss is that
a small amount of experience (e.g., some popular
reading) might give the S wrong information. If this
happened, the medium experience Ss would have more
room for improvement than the low experience Ss.
However, Experience did not have a significant effect on test 1
scores. A more probable explanation is that medium
experience Ss had more interest in the topic than low
experience Ss and hence learned more, but it cannot
be said which came first, interest or experience.


The significant effect of Feedback on factual
changes was impressive because its manipulation was
quite minimal and the effects weren't measured until
after 6 weeks. Observation of Ss during the feedback
session suggested they were quite interested in receiving
feedback even though they knew they weren't
being graded on the test.


Associative Interference and Attitudes 


The 2-factor verbal-learning model of forgetting
(4) predicts that over time proactive inhibition (PI)
will increase relative to retroactive inhibition (RI).
Thus it might be expected that there would be a change
in the factual scores from test 2 to 3 in the direction
of test 1. That is, when S learns a set of factual
material (new learning) that is contradictory to his
prior learning (as measured by test 1), then when tested
soon after the new learning (test 2), there is a
change in performance in the direction of the new
learning. However, when tested considerably later
(test 3) there should be a regression toward his prior
position (test 1).


The present experiment did not find such regression,
except for one attribute of Topic and Factual
classification. Generally there was no evidence for
any regression between tests 2 and 3, and in the case
of the most dominant attribute of Topic and Factual
classification there was an increase rather than a regression.
Also, most theories of retention would predict
more forgetting during the 6 weeks between test
2 and test 3.


A possible explanation for the failure of the data
to fit the simple interference model is that the time
intervals employed in this study were significantly
longer than those of the usual verbal learning studies.
It might be argued that all predicted effects from the
interference model are over by the time of test 2.
There is a report (7) of an increase in the PI of
descriptive prose up to 7 days, but this was with
noncontroversial material.


A second explanation that might be concluded from
Underwood and Ekstrand (11) and Mills and Winocur
( ) is that the learning measured by test 1 was well
learned under distributed practice and this didn't
[...] easily, decreasing the relative increase in
PI over time.


[...]


It is further proposed that 6 weeks later, at the
time of test 3, there is a reduction in resistance
which more than offsets any forgetting. The reduction
may be due to a dissipation of the motivational
state that resulted from the conflict between the prior
beliefs and the new learning, and/or it could be due
to a gradually built-up confusion of the S about what
he originally believed and the ideas of the new learning.


Consider Bias classification, which was not significant
on test 2 but was significant on test 3. On test
3 those Ss who had judged the new learning to be the
most biased showed the biggest factual change, while
those who had judged it to be the least biased showed
the least factual change. The following set of relationships
might hold: The greater the difference between
S's test 1 scores and the new learning, the greater
the room for change but also the greater the conflict.
And the greater the conflict, the greater the
resistance to change and the more biased the S is apt
to perceive the new learning. Thus on test 2 Ss who
perceive the new learning as very biased have the
greatest room for factual change, but this is offset by
their also having the greatest resistance to change.
Hence Bias classification does not have a significant
effect on test 2 factual scores. But on test 3, when resistance
has decreased, the Ss with the highest bias
scores have the most room for change and in fact
change the most.


The effect of Topic on factual changes increased
in significance from the .025 level to the .001 level
between tests 2 and 3. Topic had a significant effect
on bias scores.


Any explanation for the results of the present
study or what actually takes place in a classroom
must take into account the interactions between the
content and presentation of the new material and the
attitude-belief structure of individual Ss.


FOOTNOTES 


1. This study was conducted while the author was associated
with the Center for Research on Learning
and Teaching at the University of Michigan.
The author would like to thank Stanford C. Ericksen
for his support and suggestions. Requests for reprints
should be sent to the Faculty of Psychology,
University of West Florida, Pensacola, Florida.


2. The effects of the variable Topic are confounded
with the fact that the hypnosis and therapy tests
were not equated for difficulty. But for the purpose
of this study Topic is used only to determine
the relative performance on these particular tests
without making inferences about how the populations
respond to these topic materials. This is particularly
so since Topic did not significantly interact
with any other variable.
ABSTRACT


This study examined the influence of two widely used instructional programs on first-grade-level pupil
achievement in economics. A total student population of 116 was studied. Six intact groups from four elementary
schools in West Springfield, Massachusetts, were divided into three groups of two each. One group was assigned
the Science Research Associates (SRA) materials for a full year, one group was assigned the use of the
Follett materials for a full year, and the third group spent a semester using each of these programs. Data were
gathered through the administration of the Spears' Test for Achievement in Economics in a pretest-posttest fashion.
The data collected were submitted to both analysis of variance and covariance techniques. The results of this
study indicate that pupils taught with a full year of SRA materials achieved significantly higher than pupils taught
with a full year of Follett. When intelligence and pretest scores were held constant, the combination SRA and Follett
group scored significantly higher in pupil achievement than the Follett group alone. When teacher
understanding of economics concepts was held constant, the combination group did not score significantly higher than the
Follett group on pupil achievement.


WHILE a review of pertinent literature showed
a number of studies concerned with teaching economics
at the elementary level, empirical research
measuring the effectiveness of instruction
has, until most recently, been virtually nonexistent.
The 1960 volume of the Encyclopedia of Educational
Research does not list a single study of economic education
at the elementary-school level. The considerable
disparity in the number of research programs
devoted to the elementary grades, as compared to
intermediate, secondary, and adult levels, is due in
part to the requirements for testing first-grade pupils.
For example, criterion instruments and research
designs have to be substantially different in testing
first-grade pupils as compared to testing adolescents.


Jefferds was among [...] empirically first-grade pupils
[...] the understanding of economics (2:41).
His study attempted to measure a specific and
widely used set of instructional materials, the Senesh
[...] published by SRA, to determine if children
instructed by these new materials received any
educational advantages over children taught by conventional
methods.


Based upon his findings, Jefferds concluded that:
(1) "children did no better on the criterion test with
packaged instructional materials than with local materials";
and (2) "there were no significant differences
in children's understanding of economic concepts
regardless of how the instructional unit was [...]


Recognition of this [...] instruments for [...]
pupils places them at a disadvantage in the first-grade
curriculum of Culver City, California.


While there is existing empirical evidence that
first-grade pupils can comprehend economic concepts,
there has been no attempt to differentiate among the
increasing number of instructional programs at the
elementary level. The rejection or adoption of instructional
programs by school districts is therefore
frequently based only upon the opinion of teachers and
not upon empirical evidence.


PROBLEM 


The basic purpose of this study was to compare
the effectiveness of two widely used sets of social
studies instructional materials in teaching economics
to pupils at the first-grade level. The following null
hypotheses were tested. (1) There will be no significant
differences in achievement in economics for
first-grade pupils among groups which are instructed:


1. with SRA materials for 2 semesters (Group A) and
with Follett materials for 2 semesters (Group B);


2. with SRA materials for 2 semesters (Group A) and
with Follett materials for 1 semester and SRA materials
the other semester (Group C);


3. with Follett materials for 2 semesters (Group B)
and with Follett materials for 1 semester and SRA
materials the other semester (Group C).


(2) There will be no significant differences in achievement
in economics among pupils in different socioeconomic
levels within each of the groups or among
the groups.


SRA provides extensive instructional materials,
largely prescriptive in design, which have as their
objective teaching pupils basic social science understanding
through the discovery process. The SRA materials
consisted of the following items: (1) a textbook,
Our Working World: Families at Work, 272 pp.;
(2) a 198-page resource unit for teachers containing
aids, suggestions, and activities for the fifty-five lessons
in the textbook; (3) a 71-page picture workbook
for the pupils; (4) a 49-page handbook for teachers
which contained the written transcription of the records
used to supplement each of the last twenty-seven
lessons of the textbook. All of these materials were
developed under the direction of Lawrence Senesh.


The Follett program has similar goals; however,
it is less extensive, less prescriptive, and allows
greater flexibility for the teacher. Follett materials
contain the following: (1) two textbooks for the pupils,
Billy's Friends, 143 pp., and Exploring with Friends,
168 pp. (the former text was used in the first semester's
instruction; the latter in the second semester);
(2) a 63-page teacher's guide for Billy's Friends and
a 64-page teacher's guide for Exploring with Friends.
Both guides contained objectives, teaching aids, and
suggestions for a first-grade social studies program.


DESIGN 


This study utilized six intact first-grade classes
from four elementary schools of West Springfield,
Massachusetts. Pupils from the six classes were combined
into three groups, and each group was instructed
with a different set of social studies instructional
materials. The instrument used to measure the effectiveness
of the three programs of instructional materials
was Spears' Test for Achievement in Economics.


The scores of the 116 pupils in the study were analyzed
according to the program with which they were
instructed and their socioeconomic status. Pupils
were pretested in January 1969 and posttested in May
1969.


Pupil social position was determined by using
Hollingshead's Two Factor Index of Social Position.
This method utilizes the occupational and educational
levels of the family's main wage earner as determinants
of social position. A pupil information form was
developed to secure this information. Scale values were
found for the two factors, and these values were then
statistically treated to provide an index of each pupil's
social position. Pupils were classified into three socioeconomic
levels: (1) level one represented pupils
of the upper middle class; (2) level two represented
pupils of the middle class; and (3) level three represented
pupils of the lower middle class.


Analysis of variance treatment was used for de- 
termining if significant differences in achievement in 
economics existed among the groups. Two-way anal- 
ysis of variance treatment was used for determining 
if significant differences in achievement in economics, 
based on socioeconomic status, existed within or among 
the groups. Analysis of covariance treatment was used 
to secure statistical equalization on certain relevant 
variables which could have confounded the relation- 
ships under investigation. 


DETERMINING TEST INSTRUMENTS 


The instrument used to measure achievement of 
first-grade pupils in economics was developed by Sol 
Spears. The test consisted of twenty-six multiple- 
choice items. 


In consideration of the maturity and attention span
of first-grade pupils, this test takes approximately
30-33 minutes to administer. Due to the limited reading
ability of first-grade pupils, the instructions for
the pre- and posttest were tape-recorded in order to
make them identical for all groups. Items for the instrument
were obtained from a review of sources including
textbooks, course outlines, statements of objectives,
and questions from other tests.


A jury of economists who are subject matter experts
in economic education, Dr. Norman Townshend-Zellner
and Dr. John Lafky, examined the economic
content aspects of the test and endorsed it for validity.
Only those items on which there had been complete
agreement by both judges were included in the
test. The reliability coefficient for Spears' Test for
Achievement in Economics is .78, which was deemed
acceptable for group testing purposes.


SELECTION OF INSTRUMENT MEASURING SOCIAL
POSITION


Hollingshead's Two Factor Index of Social Position
was chosen as the instrument for determining pupil
social position (1:116). This instrument met the
need for an objective, easily applicable procedure to
estimate the position individuals occupy in the status
structure of our society.


According to Hollingshead, the following break- 
down is meaningful for predicting the social class po- 
sition of an individual: 


Social Class    Range of Computed Scores


I               11-17
II              18-27
III             28-43
IV              44-60
V               61-77


Social class I represents the highest social position, 
and social class V the lowest. 


For purposes of this study, an arbitrary delineation
was made for pupil groupings. Level 1 was composed
of pupils falling within social classes I and II.
Level 2 was composed of pupils located within social
class III. Level 3 was composed of pupils falling
within social classes IV and V.
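

The class ranges and the study's three levels can be written out as a small lookup, sketched below. The score ranges and the level groupings come from the text. The weights of 7 for occupation and 4 for education, applied to 1-7 scale values, are the ones commonly cited for Hollingshead's index and are an assumption here, since the text says only that the scale values were "statistically treated". All function and variable names are hypothetical.

```python
# Score ranges for Hollingshead's social classes, as listed in the text.
CLASS_RANGES = {
    "I": (11, 17),
    "II": (18, 27),
    "III": (28, 43),
    "IV": (44, 60),
    "V": (61, 77),
}

# The study's arbitrary grouping of social classes into three levels.
LEVEL_BY_CLASS = {"I": 1, "II": 1, "III": 2, "IV": 3, "V": 3}


def index_score(occupation_scale: int, education_scale: int) -> int:
    """Weighted index score.

    ASSUMPTION: weights 7 and 4 on 1-7 scale values; these figures are
    not stated in the text.
    """
    for value in (occupation_scale, education_scale):
        if not 1 <= value <= 7:
            raise ValueError("scale values are assumed to run from 1 to 7")
    return 7 * occupation_scale + 4 * education_scale


def social_class(score: int) -> str:
    """Return the social class (I-V) whose range contains the score."""
    for name, (low, high) in CLASS_RANGES.items():
        if low <= score <= high:
            return name
    raise ValueError("score falls outside the 11-77 range")


def study_level(score: int) -> int:
    """Return the study's socioeconomic level (1, 2, or 3) for a score."""
    return LEVEL_BY_CLASS[social_class(score)]
```

Under the assumed weights, the lowest and highest possible scale values give scores of 11 and 77, which matches the range of computed scores in the table.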


In order to determine the socioeconomic status
of the Ss, a pupil information form was developed. All
pupils were requested to take the form home for a
parent (or guardian) to complete. To secure parent cooperation,
a covering letter which briefly described
the purpose of the study, and firmly assured the parent
that no names would be used in the study, was sent
to each pupil's home. All the forms were returned,
and every parent supplied the desired information.
The data obtained from the form were used to establish
a scale score which was then weighted to give
pupils a social class position.


PUPIL POPULATION AND TEACHER SELECTION 


The Ss for this study were selected
from the first grades of four elementary
schools in West Springfield, Massachusetts, a suburb
with a population of 26,070 in 1965. The district has nine elementary
schools with an enrollment of 3,097 pupils. There are
eighteen first-grade classes in the district.


The six teachers who participated in the study
were selected at random. All of the eighteen first-grade
teachers had indicated a willingness to participate
in the study. However, only fifteen met the following
criteria: (1) completion of 3 or more years of
successful teaching, and (2) a rating, by their administrators,
of above average or excellent.


Course background in economics was very sparse 
for the six teachers. Two teachers had a 3-hour 
course more than 10 years before, two teachers had 
an in-service course in economics offered by Boston 
University in 1964-1965, and two teachers had never 
had any course work in economics. Thus, the mean
hours of college course work completed in economics
by the six teachers was less than 3.


All elementary teachers in the system experienced
a 30-hour in-service training program in the
teaching of economics. The same economist taught
all the teachers, and at the conclusion of the training
program, all teachers were given the Teacher's
Economic Understanding Test. The results of this
test provided data used to equate, statistically, the
influence of any variance in teacher knowledge of economic
materials.


Prior to the actual pupil testing time, the researcher
met with each participating teacher to explain
the purposes and procedures of the testing program.
At that time it was made clear that only the effectiveness
of the instructional materials, not teacher
competence, was to be measured.


ADMINISTRATION OF TEST INSTRUMENT 


January 16 and 17, 1969, were selected for the
administration of the pretest to all Ss. The posttest
was administered May 28 and 29, 1969. The teachers
never saw the measuring instrument used in the
study, nor were there any meetings between the
teachers and investigators from January to May.


The investigators scored all of the tests in order
to insure consistent scoring. All of the 116 tests
were usable and formed the basis from which data
were obtained.


ANALYSIS OF THE DATA 
Major Hypothesis 


There are no significant differences in pupil
achievement as measured by the Spears' test among
pupils exposed to SRA materials for 2 semesters
(Group A), pupils exposed to Follett materials for 2
semesters (Group B), and pupils exposed to SRA materials
for 1 semester and Follett materials the other
semester (Group C).


Findings 


Using the raw scores pupils received on the Spears'
test, an analysis of variance was applied to determine
the F value of the difference. The data cited in
Table 1 show the test of significance with the Spears'
test among the groups. The F value for the difference
was determined to be 8.84, and the F table indicated
that 7.37 (an approximate interpolated value) was
needed to be significant at the .001 level of confidence.
The null hypothesis of no significant differences
among Groups A, B, and C was, therefore,
rejected.


TABLE 1


ANALYSIS OF VARIANCE FINDINGS ON POSTTEST
ACHIEVEMENT SCORES GROUPED ACCORDING TO
INSTRUCTIONAL MATERIALS


Source of Variation        df     Sum of Squares    Mean Sum of Squares    F Ratio
Instructional Materials    2      107.83            53.92                  8.84*
Error                      113    688.96            6.10


* Significant at the .001 level of confidence.
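

As a check on the arithmetic, the F ratio reported in Table 1 can be recomputed from the sums of squares and degrees of freedom given there. The sketch below is only illustrative; the function name is hypothetical, and the critical value of 7.37 is taken from the text rather than computed.

```python
def one_way_f_ratio(ss_between: float, df_between: int,
                    ss_error: float, df_error: int) -> tuple[float, float, float]:
    """Return (mean square between, mean square error, F ratio)."""
    ms_between = ss_between / df_between
    ms_error = ss_error / df_error
    return ms_between, ms_error, ms_between / ms_error


# Values from Table 1 (instructional materials vs. error).
ms_materials, ms_error, f_value = one_way_f_ratio(107.83, 2, 688.96, 113)

# Critical value quoted in the text for the .001 level (interpolated by the authors).
CRITICAL_F_001 = 7.37
null_rejected = f_value > CRITICAL_F_001
```

The recomputed mean squares and F ratio agree with the tabled 53.92, 6.10, and 8.84 to within rounding, and the F ratio exceeds the quoted critical value.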


The Scheffe method was applied to the posttest
scores of all the groups and the findings were:


1. Group A(SRA materials all year) scores were 
significantly greater than Group B ( Follett ma- 
terials all year ) scores; 


2. Group A scores were not significantly different 
from Group C ( Follett materials 1 semester and 
SRA materials 1 semester ) scores; 


3. Group B scores were not significantly different 
from Group C scores. 


The null hypothesis of no significant difference in 
achievement in economics between pupils of Group A 
and Group B was, therefore, rejected. 


The null hypothesis of no significant difference in 
achievement in economics between pupils of Groups A 
and C failed to be rejected. The null hypothesis of no 
significant difference in achievement in economics be- 
tween Groups B and C failed to be rejected. 
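
The pairwise comparisons above follow the usual form of the Scheffé test, which can be sketched as below. The study does not report per-group posttest means, group sizes, or the significance level used for the Scheffé comparisons, so the means, sizes, and critical F in this example are hypothetical; only the error mean square (6.10) is taken from Table 1.

```python
def scheffe_pairwise(means, sizes, ms_error, f_critical):
    """Scheffe test for every pair of group means.

    A pair (i, j) differs significantly when
        (mean_i - mean_j)^2 / (ms_error * (1/n_i + 1/n_j))
    exceeds (k - 1) * f_critical, where k is the number of groups and
    f_critical is the tabled F for (k - 1, N - k) degrees of freedom.
    """
    k = len(means)
    threshold = (k - 1) * f_critical
    results = {}
    for i in range(k):
        for j in range(i + 1, k):
            diff = means[i] - means[j]
            f_s = diff ** 2 / (ms_error * (1.0 / sizes[i] + 1.0 / sizes[j]))
            results[(i, j)] = (f_s, f_s > threshold)
    return results


# Hypothetical posttest means and group sizes for Groups A, B, C
# (index 0, 1, 2). These are NOT the study's figures.
example_means = [12.9, 10.6, 11.9]
example_sizes = [40, 38, 38]
MS_ERROR = 6.10          # error mean square from Table 1
F_CRITICAL = 3.11        # assumed: the interpolated .05 value quoted in the text

comparisons = scheffe_pairwise(example_means, example_sizes, MS_ERROR, F_CRITICAL)
for (i, j), (f_s, significant) in sorted(comparisons.items()):
    print("ABC"[i], "vs", "ABC"[j], round(f_s, 2), significant)
```

With these illustrative numbers only the A versus B comparison exceeds the threshold, which mirrors the pattern of findings reported above.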


At first glance, the strong findings favoring the 
SRA materials appear convincing. However, tests 
were conducted on relevant variables to determine 
whether or not the groups were truly matched. There- 
fore, analyses of variance were conducted for age in 
months, intelligence scores (IQ), socioeconomic status 
(SES), and pretest scores. These variables were 
chosen because much of educational literature is con- 
cerned with the relationship of chronological age, in- 
telligence, and socioeconomic status, to learning abil- 
ity. Pretest scores were used because they are also 
significant measures of the groups’ equivalence at the
start of the study. 


Table 2 presents the means, standard deviations,
and F ratios of the variables mentioned in the previous
paragraph, grouped according to instructional
materials (IM).


No significant differences based on age or on so- 
cioeconomic status were found among the groups. Sig- 
nificant differences were found among the groups for 
intelligence scores and pretest scores. 


TABLE 2

SUMMARY OF F VALUES FOR VARIABLES GROUPED ACCORDING TO INSTRUCTIONAL MATERIALS

             Group A           Group B           Group C          F
Variable     Mean     sd       Mean     sd       Mean     sd      Ratio
Age          86.00    3.61     84.03    4.67     84.03    5.24    2.39*
IQ          106.03   12.07    109.21    9.22    101.69   10.03    4.69**
SES           2.45    0.76      2.40    0.87      2.58    0.68    0.55*
Pretest      11.74    2.40     10.43    2.07      9.89    2.56    6.21***

* Not significant.
** Significant at .05 level of confidence.
*** Significant at .01 level of confidence.


Again the Scheffé method was applied to the
intelligence scores to determine which group was
significantly different from another. The findings were:


1. Group A was not significantly different from
Group B;

2. Group A was not significantly different from
Group C;

3. Group B was significantly greater than Group C.


The Scheffé method was also applied to the pretest
scores, and the findings were:


1. Group A was significantly greater than Group B;

2. Group A was significantly greater than Group C;

3. Group B was not significantly different from
Group C.


Socioeconomic Status


Table 3 presents the number of Ss, mean scores,
standard deviations, and F ratios with the variables
(age in months, intelligence scores, pretest scores,
and posttest scores) grouped according to
socioeconomic status.


No significant differences were found for the
variables listed in Table 3 when pupils were grouped
according to socioeconomic status (SES). An F ratio of
3.11 (an approximate interpolated value) at the .05
level is needed for significance. The data cited in
Table 3 show none of the F ratios to be significant.
The null hypothesis of no significant differences in
achievement among pupils grouped according to
socioeconomic status failed to be rejected.


TABLE 3

SUMMARY OF F VALUES FOR VARIABLES GROUPED ACCORDING TO SOCIOECONOMIC STATUS

              Level I               Level II              Level III           F
Variable    n    Mean     sd      n    Mean     sd      n    Mean     sd     Ratio
Age        20    84.35   4.58    21    84.09   4.01    74    84.93   4.80    0.33
IQ         20   110.60   8.22    19   104.58   9.29    71   104.69  11.61    2.51*
Pretest    20    10.70   2.34    21    11.29   2.53    75    10.51   2.46    0.83*
Posttest   20    11.60   3.00    21    12.42   2.04    75    11.42   2.67    1.20*

* Not significant at the .01 level of confidence.


Possible Interaction Between Socioeconomic Status
and Teaching Materials


Popham stated on interaction: “The general
principle involved in interaction effects is the same in all
analysis of variance models, when the research is
testing for the existence of a relationship between the
dependent variable and another variable” (3:129). In
accordance with Popham's observation, tests were
performed, and F ratios were obtained to determine
possible interaction of socioeconomic status and
instructional materials with age, intelligence scores,
pretest scores, and posttest scores.


Sex 


No significant differences were found for the
variables listed in Table 5 when pupils were grouped
according to sex. An F ratio of 3.11 (an approximate
interpolated value) is needed for significance at the
.05 level. The data cited in Table 5 show none of the
F ratios to be significant.


DISCUSSION 


Data resulting from this study indicate that the
pupils of Group A, who were instructed with the SRA
materials for 2 semesters, scored significantly higher
on the posttest of Spears’ Test for Achievement in
Economics than the pupils of Group B, who were
instructed with Follett materials for 2 semesters.
However, Group A did not score significantly higher on the
achievement test when compared to pupils of Group C,
who were instructed with Follett materials for 1
semester and SRA materials the other semester. Also,
Group C did not score significantly higher than Group
B on the posttest.


TABLE 4

SUMMARY OF F RATIOS FOR POSSIBLE INTERACTION
BETWEEN SOCIOECONOMIC STATUS AND
TEACHING MATERIALS

              Source of      F
Variable      Variance       Ratio
Age           SES x IM       0.65*
IQ            SES x IM       0.54*
Pretest       SES x IM       0.02*
Posttest      SES x IM       0.76*

* Not significant at .05 level of confidence.


No significant differences were found to exist
among the groups on age, intelligence scores,
socioeconomic status, pre- and posttest scores, when
two-way analysis of variance was applied to pupils’ scores,
within and among the groups, for possible interactions
between socioeconomic status and the instructional
materials.


No significant differences were found to exist
among the groups on age, intelligence scores,
socioeconomic status, pre- and posttest scores, when the
data were analyzed according to sex.


Treatment of Rival Hypotheses 


As Table 2 indicates, significant differences did
exist among groups on the variables of pupil
intelligence and pretest scores. This being so, two rival
hypotheses for the differences found among the groups
on the posttest could be formulated, i.e.:


1. The differences noted were not due to the
instructional program employed, but rather to the
differences in pupil intelligence;

2. The differences noted among groups were not due
to the instructional program employed, but rather
to the fact that the groups did not start at an
equal point;


and an additional rival hypothesis also presents
itself, i.e.:


3. The differences noted among groups were not due
to the instructional program employed, but rather
to the fact that the teachers varied in their
understanding of the substantive material taught
(economics).


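Each of the rival hypotheses was investigated,
using analysis of covariance to hold constant the
effect of the variable in question.


Rival Hypothesis I. The differences noted were
due to differences in pupil intelligence, not to the
instructional program employed. Table 2 indicates
that the three groups were not equivalent (p < .05)
with respect to intelligence scores. Therefore, the
analysis of covariance method, which partialled out
pupils’ intelligence scores on posttest results, was
applied to analyze this difference.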
TABLE 5

SUMMARY OF F VALUES FOR VARIABLES GROUPED ACCORDING TO SEX

              Male (n=63)        Female (n=53)       F
Variable      Mean     sd        Mean     sd         Ratio
Age           84.61    4.63      84.75    4.62       0.03*
IQ           104.86    9.92     106.76   11.86       0.84*
SES            2.38    0.79       2.58    0.75       2.02*
Pretest       10.76    2.22      10.58    2.72       0.15*
Posttest      11.59    2.76      11.70    2.50       0.05*

* Not significant at .05 level of confidence.


At the .001 level, an F ratio of 7.54 (an approximate
interpolated value) is needed for significance.
The data cited in Table 6 show the F ratio to be
10.23. Therefore, a significant difference still
existed among the groups for posttest scores, with the
effects of intelligence scores partialled out. Again,
as in the analysis of the data presented in Table 2,
it was not possible to determine which group was
significantly different from another without application
of the Scheffé method.


The Scheffé method was applied and, with pupils'
intelligence scores partialled out, the findings from
the posttest scores were:


1. Group A was significantly greater than Group B;

2. Group A was not significantly different from
Group C;

3. Group C was significantly greater than Group B.


These findings indicate that pupils instructed with
SRA materials for 2 semesters, and pupils instructed
with Follett materials for 1 semester and SRA
materials the other semester, achieved higher scores
on the achievement test than pupils who were instructed
with Follett materials all year.


TABLE 6

ANALYSIS OF COVARIANCE FINDINGS ON POSTTEST
ACHIEVEMENT SCORES GROUPED ACCORDING
TO INSTRUCTIONAL MATERIALS WITH
INTELLIGENCE SCORES PARTIALLED OUT

                          Corrected    Corrected
Source of                 Sums of      Mean        F
Variation          df     Squares      Square      Ratio
Instructional
  Materials         2     114.33       57.16       10.23*
Error             112     625.70        5.59

* Significant at the .001 level of confidence.
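
A one-way analysis of covariance with a single covariate, of the kind summarized in Table 6, can be sketched as follows. This is the textbook adjusted-sums-of-squares computation, not necessarily the exact procedure the investigators used, and the small data set is hypothetical (the study's raw scores are not reported). Note that the error term loses one further degree of freedom for the covariate, which is why Table 6 shows 112 rather than 113.

```python
def ancova_one_covariate(groups):
    """One-way ANCOVA with one covariate.

    groups: list of (covariate_values, outcome_values) pairs, one per group.
    Returns (adjusted SS between, adjusted SS error, df between, df error, F).
    """
    def deviations(xs, ys):
        n = len(xs)
        mean_x = sum(xs) / n
        mean_y = sum(ys) / n
        sxx = sum((x - mean_x) ** 2 for x in xs)
        syy = sum((y - mean_y) ** 2 for y in ys)
        sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
        return sxx, syy, sxy

    all_x = [x for xs, _ in groups for x in xs]
    all_y = [y for _, ys in groups for y in ys]
    total_xx, total_yy, total_xy = deviations(all_x, all_y)

    within_xx = within_yy = within_xy = 0.0
    for xs, ys in groups:
        sxx, syy, sxy = deviations(xs, ys)
        within_xx += sxx
        within_yy += syy
        within_xy += sxy

    # Remove the part of the outcome variation predictable from the covariate.
    adjusted_total = total_yy - total_xy ** 2 / total_xx
    adjusted_error = within_yy - within_xy ** 2 / within_xx
    adjusted_between = adjusted_total - adjusted_error

    df_between = len(groups) - 1
    df_error = len(all_x) - len(groups) - 1   # one df spent on the covariate
    f_value = (adjusted_between / df_between) / (adjusted_error / df_error)
    return adjusted_between, adjusted_error, df_between, df_error, f_value


# Hypothetical scores: x is the covariate (e.g. IQ or pretest), y the posttest.
noise = [0.1, -0.1, -0.1, 0.1]
group_a = ([1, 2, 3, 4], [2 * x + e + 3 for x, e in zip([1, 2, 3, 4], noise)])
group_b = ([3, 4, 5, 6], [2 * x + e for x, e in zip([3, 4, 5, 6], noise)])
group_c = ([5, 6, 7, 8], [2 * x + e for x, e in zip([5, 6, 7, 8], noise)])

result = ancova_one_covariate([group_a, group_b, group_c])
print("df:", result[2], result[3], "F:", result[4])
```

In the hypothetical data Group A receives a constant advantage of 3 points beyond what the covariate predicts, so the adjusted F is large; without that advantage the group differences would be fully explained by the covariate and the adjusted F would fall to about zero.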


Rival Hypothesis II. The differences noted were
due to the fact that groups did not start at an equal 
point (as demonstrated in the pretest analysis), not 
due to the instructional program employed. Table 
2 indicates that the three groups were not equivalent 
(p= .01) with respect to pretest scores. Therefore, 
analysis of covariance method, which partialled out 
pupils’ pretest scores on posttest results, was applied 
to analyze this significant difference. 


An F ratio of 4.89 (an approximate interpolated
value) is needed for significance at the .01 level. The
data cited in Table 7 show the F ratio to be 7.17.
Therefore, significant differences still existed among
the groups for posttest scores with the effects of
pretest scores partialled out. Again, as in the analysis
of the data presented in Table 2, it was not possible
to determine which group was significantly different
from another without application of the Scheffé
method. The Scheffé method was applied, and the findings
from posttest scores, with pretest scores partialled
out, were:


1. Group A scores were significantly greater than 
Group B scores; 


2. Group A scores were not significantly different 
from Group C scores; 


3. Group C scores were significantly greater than 
Group B scores.


These findings indicate that pupils instructed with 
SRA materials for 2 semesters, and pupils instructed 
with Follett materials for 1 semester and SRA mate- 
rials the other semester, achieved higher scores on 
the achievement test than pupils who were instructed 
with Follett materials all year. These indications were 
also shown by the findings presented in Table 6. 


TABLE 7

ANALYSIS OF COVARIANCE FINDINGS ON POSTTEST
ACHIEVEMENT SCORES GROUPED ACCORDING
TO INSTRUCTIONAL MATERIALS WITH
PRETEST SCORES PARTIALLED OUT

                          Corrected    Corrected
Source of                 Sum of       Mean        F
Variation          df     Squares      Square      Ratio
Instructional
  Materials         2      82.41       41.21       7.17*
Error             112     643.33        5.74

* Significant at the .01 level of confidence.


Rival Hypothesis III. The differences noted were
due to the differences which existed in the teachers’
understanding of the substantive material taught
(economics), and not due to the instructional material
employed.


Although the data of Table 2 do not include
tests relating to teachers’ understanding of economics
as a variable which could confound the results of the
study, it seems that observation of the findings might
pose the hypothesis that teacher understanding of
economics would have an effect. As mentioned earlier,
each pupil was assigned a score indicating his own
teacher’s level of understanding of economics.
Teachers’ level of understanding of economics was
measured by the Test of Economic Understanding (TEU).


The data showed an F ratio of 6.69. For the F
ratio to be significant at the .01 level, an F ratio of
4.89 (an approximate interpolated value) was needed.
Therefore, the data showed that differences among
the groups on posttest scores were still significant
with the teachers’ understanding of economics
partialled out. However, as in the cases of the
intelligence scores and the pretest scores, it was not
possible to determine which group was significantly
different from another without application of the Scheffé
method. The Scheffé method was applied and the
findings from posttest scores, with teachers’ TEU
scores partialled out, indicated:


1. Group A was significantly superior to Group B;

2. Group A was not significantly superior to Group C;

3. Group B was not significantly superior to Group C.


These findings indicate that pupils instructed with
SRA materials for 2 semesters achieved significantly
greater scores than pupils instructed with Follett
materials all year. However, in this instance (TEU scores
partialled out of posttest scores), pupils who were
instructed with Follett materials for 1 semester and
SRA materials the other semester did not achieve
significantly higher scores than pupils instructed with
Follett materials all year.


SUMMARY 


For pupils’ scores on achievement posttest, based 
on instructional materials, data resulting from this 
study indicate: 


1. First-grade pupils of West Springfield,
Massachusetts, who were instructed with SRA
materials all year, achieved consistently higher
scores on the measuring instrument than pupils
instructed all year with Follett materials.


2. In two out of three instances (when intelligence
scores and pretest scores were partialled out of
posttest results), pupils who were taught with
both Follett and SRA materials achieved higher
scores on the measuring instrument than pupils
instructed all year with Follett materials.


3. In one case out of three (when TEU scores of
teachers were partialled out of posttest results),
pupils who were instructed all year with Follett
materials equaled achievement of pupils taught with
Follett materials 1 semester and SRA materials
the other semester.


For pupils’ scores based on socioeconomic status,
data resulting from this study indicate that no
significant differences existed within and among the
groups in posttest achievement when students were
grouped according to social position.


For pupils’ scores based on sex, data resulting
from this study indicate that no significant differences
existed among the groups in posttest achievement
scores.


CONCLUSIONS 


Based upon the data collected and analyzed in
this study, the following conclusions were drawn:


1. Pupils instructed with SRA materials achieved
consistently higher scores on a test of achievement
in economics than did pupils who were
instructed with Follett materials.

2. The socioeconomic status of the pupils did not
affect performance on the achievement test. This
result does not support the results of another
study involving SRA materials which found that
pupils of lower socioeconomic status performed
at a lower level than pupils of middle
socioeconomic levels.

3. A combination of instructional materials (1
semester with SRA materials and the other
with Follett materials) seems to result in a level of
achievement in economics equal to that of pupils
instructed with SRA materials all year.

4. The evidence provided by this study indicates
that pupils can learn economics through an
interdisciplinary approach.


TABLE 8

ANALYSIS OF COVARIANCE FINDINGS ON POSTTEST
ACHIEVEMENT SCORES GROUPED ACCORDING
TO INSTRUCTIONAL MATERIALS WITH TEU
SCORES PARTIALLED OUT

                          Corrected    Corrected
Source of                 Sums of      Mean        F
Variation          df     Squares      Square      Ratio
Instructional
  Materials         2      81.55       40.78       6.69*
Error             112     682.90        6.10

* Significant at the .01 level of confidence.


SCHUCK and DEROSIER


IMPLICATIONS 


This study did not attempt to assess the SRA or
Follett materials “in toto.” This study was concerned
only with comparing the effectiveness of the
materials on achievement in economics. However, these
materials are concerned with much more than
economics; geography, anthropology, political science,
sociology, and history are interrelated parts of each
set. Thus, a need exists to investigate how these
interrelated parts work together, as a whole, to
accomplish the purposes for which the materials exist.


Systems analysis asks the educator to see his ac- 
tivity as a whole —not only the instructional materials 
but also the child, the curriculum, the media, the 
teacher, and the management network which puts 
these and other resources together into a functional 
system. Educators might then acquire needed meas- 
urements on expenditure of energy and resources. 
Therefore, a new approach to the materials problem 
might be to think assiduously in terms of the way ma- 
terials relate to the entire educational process. 


Recommendations for Further Research.
Recommendations based on the data and observations of this
study are as follows:


(1) Using these first-grade social studies instruc- 
tional materials, a full year study should be 
made to determine the effect of greater time 
duration on achievement in economics. 


(2) A study should be made involving the Follett
and SRA materials with groups selected by a
random sampling method. As noted in the study,
even with intelligence scores statistically
equated, the pupils instructed with SRA for 1
semester and Follett the other semester achieved
scores on the achievement test significantly
greater than pupils who were instructed with
Follett materials all year. More data should
be obtained on this variable, because the
pupils instructed with Follett materials possessed
significantly higher intelligence scores. In
addition, the variable's “suppressing” or
“moderating” effect on the posttest scores needs
more investigation.


(3) A study should be conducted to assess more
precisely the effect of teacher training on the
teaching of these materials. “Teacher proof”
instructional materials need much more
empirical examination to warrant that label.
Therefore, further research on the SRA and Follett
materials should involve both teachers who have
experienced in-service training in teaching
economics, and teachers who have experienced
no training at all in teaching economics.


(4) Both the instructional materials investigated
in this study pursue the learning of economics
through an interdisciplinary approach. A study
which compared the achievement of first-grade
pupils who were taught economics as an
independent discipline to pupils who were taught
economics through an interdisciplinary
approach could provide evidence of the
effectiveness of the two methods.


(5) As stated elsewhere in the study, both the SRA
and Follett materials are concerned with
attitudes and values. There has been virtually no
research on the effectiveness of these
materials in the areas of attitudes and values. It would
be of considerable interest to investigate whether
or not affective changes take place with
pupils who have been taught with these materials.


(6) There is a need to assess these materials with
urban pupils, because most of the studies which
have measured first-grade pupil achievement
in economics have focused on white suburban
children.


(7) In order to confirm or reject the findings of
this exploratory study, parts of it should be
replicated in other sections of the country with
new variables introduced. There is great need
for more empirical evidence on other
materials, other methods, and other objectives. The
major problems of educational research are so
big and so complex that breakdowns into minor
problems might yield findings which are
significant for solving the grand problem.
ABSTRACT


Alpha-numeric fonts have hitherto been designed with an aesthetic end in view.
... reading material is achieved through statistically derived evaluations.
... frequency spectra and cross correlation of font symbols can be
quantitatively measured in the laboratory and that they provide a sound basis
for differentiation between type fonts or within a given font. The method also
allows the optical spectrum to be altered, and a modified or new letter to be
constructed or re...


THE TEACHING of reading continues to re- 
ceive major attention in educational research and 
development. This is due in part to the fact that 
ability to read well is basic to academic success and 
appears to be a controlling factor as to the vocation 
a student will follow and the type of life he will lead. 
Equally responsible for this interest in reading is the 
frustration educators feel at the puzzling lack of suc- 
cess in learning to read experienced by a large num- 
ber of pupils, particularly boys. No one method and 
no single set of materials for the teaching of reading 
have thus far proved equally effective for all pupils.
Thus research continues to focus on all aspects of 
the problem, with notable emphasis upon the reading 
process itself. 


The primary goal in reading may perhaps be de- 
fined as the instantaneous transfer of thought from 
the printed page to the reader's mind through a vi- 
sual and neurological process. It has been noted 
that the recognition of letter symbols by the eye is 
a function that frequently causes difficulty for begin- 
ning readers, especially with letters that represent 
reversals or inversions of characters of similar
shape (p and q, m and w, d and b, etc.). The
authors propose the hypothesis that a quantitative
measure of letter symbols is derivable. The following
questions immediately arise: Can a minimum quan- 
titative standard of difference be established for let- 
ter symbols? Can type design be developed or re- 
vised for letters showing a high degree of similarity 
to make it easier for the eye to differentiate among 
them? Would an alphabet with greater differences 
in recognition among letter symbols be a help to
pupils learning to read and prevent reading blocks?


The brief review by Gray (2:1108-1111) considers
many aspects of font and type size as functions of
preference, prevalence, ...


Our p... the members of at least ... writer
uppercase font) have inter... and measurable
correlation.


METHOD 


... generated with an optical correlator ... The letter
symbol to be analyzed is contained in the system as a
white-on-black film ... Light transmitted by the clear
portions of the film is collected with a transform lens.
The spatial frequency spectrum of the letter symbol is
given by the light distribution pattern in the back focal
plane of the lens. ... axis. Other frequencies in the
two dimensional space are measured radially from the
axis. A light detector placed anywhere in this plane
measures the two dimensional light pattern so obtained.


FIGURE 1

SCHEMATIC DIAGRAM OF OPTICAL SYSTEM FOR MEASURING CORRELATION OF TWO LETTER
SYMBOLS

[Diagram labels: collimating optics; transparency of first letter, f1(x, y); imaging lenses]


A simplified arrangement of the optical components
used to measure the correlation between two letter
symbols is shown in Figure 1. Both letter symbols are
presented as film transparencies. An image of the first
symbol (B) is projected onto the second (W). Laser light
that has been transmitted by both film strips is
collected by the transform lens. The 2-dimensional light
distribution pattern observed in the back focal plane of this
lens now represents the spatial frequency spectrum of
the superposed letters.


Let the real functions $f_1(x, y)$ and $f_2(x, y)$
represent the black and white density variations on the first
and second film transparency, respectively. The
light intensity $I$ measured at points in the back focal
plane is given by

$$
I = \left| \iint f_1(x, y)\, f_2(x + \xi,\; y + \eta)\, \exp(j \nu x)\, \exp(j \lambda y)\, dx\, dy \right|^2 \tag{1}
$$

where $\xi$ and $\eta$ refer to the x- and y-displacements
of the first transparency with respect to the second.
The parameters $\nu$ and $\lambda$ are called spatial frequency
variables and are proportional to the position
coordinates in the back focal plane. The light intensity at
the focal point, i.e., $\nu = 0 = \lambda$, becomes

$$
I_0 = \left| \iint f_1(x, y)\, f_2(x + \xi,\; y + \eta)\, dx\, dy \right|^2 \tag{2}
$$


The measured intensity $I_0$ is related to the square
of the correlation of the letter symbols when both
symbols are oriented at the same angle (4:89).
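
The zero-frequency measurement can be imitated numerically. The sketch below is a discrete analogue only: it replaces the integral with a sum over pixels of two small binary "transparencies" and squares the result, following the statement that the measured intensity is related to the square of the correlation. The 5 x 5 glyphs and the function name are hypothetical stand-ins, not the B and W transparencies used by the authors.

```python
def zero_frequency_intensity(f1, f2, xi=0, eta=0):
    """Squared overlap sum of f1(x, y) * f2(x + xi, y + eta).

    f1 and f2 are lists of rows holding transmission values (1 = clear,
    0 = opaque). Pixels shifted outside f2 contribute nothing.
    """
    total = 0.0
    for y, row in enumerate(f1):
        for x, value in enumerate(row):
            xs, ys = x + xi, y + eta
            if 0 <= ys < len(f2) and 0 <= xs < len(f2[ys]):
                total += value * f2[ys][xs]
    return total ** 2


# Hypothetical 5 x 5 glyphs: a vertical bar and a horizontal bar.
VERTICAL_BAR = [[1 if x == 2 else 0 for x in range(5)] for _ in range(5)]
HORIZONTAL_BAR = [[1 if y == 2 else 0 for _ in range(5)] for y in range(5)]

same = zero_frequency_intensity(VERTICAL_BAR, VERTICAL_BAR)
crossed = zero_frequency_intensity(VERTICAL_BAR, HORIZONTAL_BAR)
shifted = zero_frequency_intensity(VERTICAL_BAR, VERTICAL_BAR, xi=1)
print(same, crossed, shifted)
```

Identical, aligned symbols give the largest value; symbols that share few clear regions, or the same symbol displaced so that the bars no longer overlap, give a small one. This is the sense in which a low central intensity indicates two easily distinguished letters.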


Restructuring either or both letter symbols
results in changes in the correlation values between
them. Altering the color or wavelength of the
illuminating light also conveniently changes the
frequency scale of the light intensity distribution. This is
equivalent to increasing (or decreasing), without
distortion, the size of the recorded letter symbol.

[Figure 1 diagram labels, continued: transparency of second letter, f2(x, y); transform lens; detector electronics]

Optical correlation devices utilizing conventional 
(white) light sources measure only the correlation 
given by equation 1. A laser illuminator also makes 
possible measurement of the mathematical correla- 
tion and spatial frequency content of the letter sym- 
bols. This is a consequence of the color purity and 
coherence properties characteristic of the laser.


RESULTS 


The letters B and W from an IBM typewriter font 
were chosen to illustrate the method. Typed sym- 
bols were photographed as negative transparencies; 
Figures 2 and 3 show the spatial frequency spectra 
obtained. Certain features observed in the spectrum 
are directly attributed to details of the letter symbol. 
For example, the main vertical bar of the letter B 
generates the horizontal pattern surrounding a bright 
central spot in the spectrum. Horizontal portions of 
this letter are responsible for the vertical frequency 
structure. Similarly, the horizontally-oriented ser- 
ifs of the W yield the vertical structure in the spec- 
trum. Slanted lines in this letter symbol generate 
the intricate patterns located on both sides of the cen- 
tral spot. The absence of a horizontal pattern in the 
W-spectrum is to be expected since this letter sym- 
bol has no vertical bar in its composition. Circular 
characteristics in a letter symbol yield a system of 
concentric rings in the spectrum. The degree of
circularity is dependent upon the extent to which
circular components are present in the letter. This can
be seen in the spectrum of the letter B shown in
Figure 2. The detailed composition of the spatial
frequency spectrum depends on the length, thickness,
and orientation of each bar, the number of parallel
elements in the letter symbol, and the spacing
between elements. The curvature of the letter and the
number of circular parts also contribute to the
spectrum.


Figure 4 shows the spatial frequency spectrum
obtained from the superposition of the letters B and
W. The intensity of the bright center spot is a
measure of the correlation between the letters. Note
that the spectral patterns surrounding the center spot
are more complicated for this B-W combination than
for the individual letter symbols alone.


FIGURE 2

OPTICAL FREQUENCY SPECTRUM OF THE
ROMAN LETTER B


FIGURE 3

OPTICAL FREQUENCY SPECTRUM OF THE
ROMAN LETTER W


DISCUSSION 


Criteria for the design of an alphabet structure 
should include the visual response of the eye.
Measurements of the relative response of the human
visual system have been reported (3). Figure 5 shows
their frequency response curve for the eye as a
function of spatial frequency on the retina. The relative
response factor is defined as the ratio of the ampli- 
tude of the luminance variation to the mean value of 
luminance. 


The visual response curve exhibits a relatively 
broad maximum at a retinal spatial frequency slight- 
ly above 10 cycles/mm. The response has fallen by 
two orders of magnitude at a retinal spatial frequen- 
cy of about 200 cycles/mm. The decrease in re- 
sponse at frequencies below 10 cycles/mm. is less 
pronounced. Although these results are subject to
variations and uncertainties, they can be used as a
guide in a design program.


FIGURE 4

OPTICAL FREQUENCY SPECTRUM OF THE
CORRELATIVE COMBINATION OF ROMAN
LETTERS B AND W


A program to revise or restructure an alphabet
based on prescribed correlative relations subject to
practical constraints should overcome many of the
shortcomings of the present alphabetic structure.
The spatial frequency content in the spectrum of a
given letter symbol should cluster about a value for
which the visual response is a maximum, i.e., about
10 cycles/mm. Experimentally, one attempts to
maximize the light falling in a circular ring
surrounding the zero frequency component of the
observed spatial frequency spectrum. The center
radius of this ring corresponds to the peak spatial
frequency value of 10 cycles/mm. A preassigned
spatial frequency band determined by the width of
the circular ring would be a laboratory-controlled
parameter. Individual letter symbols should be
made distinct so that the correlation of any two
restructured symbols would be low. A quantitative
measure of correlation is readily achieved by
measuring the light intensity at zero frequency of two
superposed letters, as described above.


Allowable levels of correlation would be a system
parameter.


The alphabet should not, of course, be
modified to such an extent that it no longer
resembles existing styles, since this would be
impractical and self-defeating. In view of the wide variety
of existing fonts (1), a practical approach would be
to select letter symbols and styles from those sources
that satisfy the above constraints. One could, for
example, analyze for spatial frequency the Hoffman
and other current alphabets designed for beginning
readers to measure quantitatively how well they meet
the guidelines stated.


FIGURE 5

SPATIAL FREQUENCY RESPONSE OF EYE IN
AVERAGE SAMPLE ACCORDING TO LOWRY AND
DE PALMA (1961)


FOOTNOTES

1. Superintendent of Schools, UFSD Number 23,
Wantagh, New York 11793.

2. Research Department, Grumman Aerospace
Corporation, Bethpage, New York 11714.


REFERENCES

1. Berry, W. T.; Johnson, A. F.; Jaspert, W. F.,
The Encyclopedia of Type Faces, Pitman
Publishing Company, New York, 1962.

2. Gray, W. S., in Harris, E. W. (ed.),
Encyclopedia of Educational Research, Macmillan, New
York, 1960.

3. Lowry, E. M.; DePalma, J. J., “Sine Wave
Response of the Visual System: I. The Mach
Phenomenon,” Journal of the Optical Society of
America, 51:740-746, 1961.

4. Stroke, G. W., An Introduction to Coherent
Optics and Holography, Academic Press, New
York, 1966.

BOOK REVIEWS Continued from page 18 


for the development of a mature, rational morality. “Psychological Standpoint” focuses on six underlying
assumptions about children and learning. “Teachers and Children” provides a description of a day-long observation of
one first school classroom in an attempt to provide situational settings for the theoretical aspects of the book.


Weaving throughout the book is a call for the availability of a variety of materials which can be used in a vari- 
ety of environmental settings and which will allow a variety of children to select their own pace and method of 
learning and discovering. 


But knowing Piaget does not justify the authors' existence. Their forte should be new insights into classroom 
application. This is where, as often happens, Piaget is used merely as a backdrop for someone else's ideas. A 
few examples should suffice. 


To say, "teachers must rely less on apparatus and methods and more on the child himself (the knowledge,
skills, and enthusiasm with which he arrives at school) in order to help him define and organize his responses"
sounds great in a one-to-one or one-to-small-group ratio. The fact is that this knowledge is neither available
about each individual child nor are teachers trained to be able to interpret it. Are we, therefore, left with the
authors' "normal" description of stages of development which are not applicable to each individual child? In
fact, is it not the provision of a wide variety and range of apparatus which lets children show us what their needs,
abilities, and interests are?


The discussion of literature provides an interesting paradox between the personal impression and reflections
the authors emphasize and their didacticism. To say, "Wanda Gag's Millions of Cats is particularly absorbing
to those children who have just developed the notion of one-to-one correspondence," puts this children's classic
in a totally different context. That the "Ugly Duckling" is appreciated and enjoyed by children who have developed
"conservation and reversibility" is ridiculous when, in fact, this story may be a better example of having pro-
vided some children an opportunity to experience and begin to understand these two operations for the first time.
The statement that "The House That Jack Built" does not demand "conservation of thought in the listeners" be-
cause of its repetitive nature raises some question as to whether the authors fully understand the meaning of Pia-
get's terms.


Self concept is not dealt with systematically and this important idea continues to be left to chance even after 
a lengthy discussion. 


Finally, the distinction is never made between children’s incidental interests and the right and duty of the 
school to offer competing stimuli so that some directed development will take place. The emerging curriculum 
is only as good as the motivation behind it. Where that is shallow, other priorities and techniques for involving 
children in these priorities need to be brought into play. American schools need to look at the best of the British 
Infant Schools, but a wholesale transplant will only be effective when the purposes of education and the most effec- 
tive ways of implementing them are in agreement on both sides of the water. 


Oh, Piaget, if only you had developed a theory of teaching, as well as a theory of learning, we would not have
to sort through these attempts to justify personal points of view.


Harlan S. Hansen, Reviewer 
University of Minnesota 
CONFIGURATION AS A CUE IN THE WORD
ABSTRACT


ONE of the problems of continuing interest to researchers in the area of beginning reading has
been the [illegible] visually discriminating one word from another. Two cues which have been
considered important are shared letters and similar word shape or configuration. Research
findings on the relative importance of these cues have been equivocal. While investigations by
Levin and Watson (3), Levin, Watson, and Feldman (4), and Marchbanks and Levin (5) suggest
that the first and last letters of words are the most critical cues in word recognition, other
studies (1, 2, 7) emphasize the importance of word shape.

One of the most frequently cited recent studies on this topic has been that of Marchbanks and
Levin (5). Kindergarten and first-grade children were given a delayed recognition
matching-to-sample task containing trigram and quin-gram discriminations. On each
presentation, the child was required to choose one of four response alternatives. Each response
contained only one cue which was like that found in the sample. These cues were identical
letters (first, middle, or last) or similar shape. The Ss most often confused words with
identical first letters, and next most often confused those with the same last letters. Similar
shape produced the fewest errors. Thus, it appears that when viewing these cues in isolation,
identical first and identical last letters produce more confusion errors between words than
does shared shape.

In Davidson's (1) investigation, which suggested that geometric shape is [illegible] used cue,
the experimental words included more than one cue at a time. For example, the author stated,
"When party [illegible] pretty the cause was laid to sim[illegible], but not so when party and
pla[illegible] were confused" (1:221). Were the Ss in [illegible] confusing party and pretty
because of similar shape, because of identical [illegible], or because of these cues in
combination?

In an attempt to provide a methodological bridge between the Davidson (1) and Marchbanks and
Levin (5) [illegible], the present investigation included not only a [illegible] the relative
saliency of identical letters and shape, but also the main effect of these variables and their
interactions on word recognition.

METHOD

Subjects

[illegible] children [illegible] randomly selected [illegible] District 61, Victoria, British
Columbia. There [illegible] number of boys and girls in the sample. Confounding produced by
formal reading instruction was minimized by conducting the study during the first few weeks of
instruction.

Materials

The trigrams used in this investigation were English letter nonsense words. The 1/4-inch
letters were produced by a Primary typewriter. Long letters such as b, f, y, and p were
extended to 1/2-inch to develop configurational differences in the words. The sample stimuli
and response alternatives were typed on booklet-bound 5x8-inch cards.


A completely counterbalanced design employing
the dimensions identical letters (initial, terminal,
and none) and shape (same and different) was used.
Each response card contained three of the possible
six response conditions. For example, if the sample
was xrg, the response choices may be mug (identical
terminal letter, same shape), [illegible] (no identical let-
ters, same shape), and xds (identical initial letter,
different shape). Each response condition occurred
an equal number of times in combination with all oth-
ers and each occurred twenty times throughout the
forty trials. The various response conditions were
randomly positioned on the response cards.
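
The counterbalancing arithmetic described above can be checked with a short sketch. It rests on an assumption that the article does not state: that the forty cards were formed by using each of the 20 possible three-condition combinations exactly twice. The condition labels are illustrative only.

```python
from collections import Counter
from itertools import combinations, product

# Six response conditions: 3 levels of identical letters x 2 levels of shape.
letters = ["initial", "terminal", "none"]
shapes = ["same", "different"]
conditions = list(product(letters, shapes))

# Assumption (not stated in the article): each card shows one of the
# C(6, 3) = 20 possible triples, and every triple is used twice (40 trials).
triples = list(combinations(conditions, 3))
trials = triples * 2

# Count how often each condition appears across all cards.
appearances = Counter(cond for card in trials for cond in card)

# Count how often each pair of conditions shares a card.
pairings = Counter(
    frozenset(pair) for card in trials for pair in combinations(card, 2)
)

print(len(trials))                # number of trials
print(set(appearances.values()))  # appearances per condition
print(set(pairings.values()))     # co-occurrences per pair of conditions
```

Under this assumption every condition appears on twenty of the forty cards and every pair of conditions meets equally often, which agrees with the counts given in the text.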


Procedure 


The discrimination task included forty matching-to-sample delayed recognition trials. Each
trial involved exposing a sample card on which there was a centered trigram. After the S had
looked at the sample stimulus for 5 seconds, the card was turned to reveal vertically
positioned response alternatives. The S was instructed to put his finger on the word that
looked to him most like the sample. During the first five trials, following each response, the
experimenter provided verbal encouragement such as: "Right, Good, Fine." No verbal feedback was
given thereafter. When the S pointed to more than one response term in a given trial, only the
first choice was recorded.


RESULTS 


The number of choices given to each response condition is presented in Table 1. A 2X2X3
repeated measures analysis (6) with sex, shape, and identical letters as factors was conducted.
There was no significant variation attributable to sex or word shape, nor were there any
significant interaction effects. A significant main effect (p<.001) was found on the identical
letter factor.

The Newman-Keuls comparison of means indicated that response alternatives with first letters
identical to those in the samples were chosen more often (p<.05) than responses with identical
last letters. The first letter and last letter conditions were each picked significantly more
often (p<.01) than responses containing no letters identical to those in the sample.


DISCUSSION 


The results of this investigation largely support 
the findings of Marchbanks and Levin (5). As in their
study, the first letter in the word seems to be utilized 
more often by beginning readers than any other cue. 
And again, this tendency appears to be consistent 
across sexes. In addition, this study suggests that 
word shape, when studied as a main effect or as a 
possible contributor to interaction differences, is a 
relatively weak cue in this type of recognition task. 
TABLE 1
CONFUSION ERRORS ON EACH RESPONSE CONDITION

                        Identical Letters
                   Initial   Terminal   None
Same Shape
  Boys               156       153       89
  Girls              163       152       94
  TOTAL              319       305      183
Different Shape
  Boys               182       129       92
  Girls              166       144       80
  TOTAL              348       273      172
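
The marginal totals of Table 1 make the pattern in the analysis easier to see. The sketch below simply sums the printed cell counts; the dictionary layout and variable names are illustrative, and the girls' "none" entry under same shape (94) is inferred from the printed total of 183 rather than read directly from the table.

```python
# Cell counts from Table 1, keyed by (shape, sex).
# Each tuple holds counts for identical letters: initial, terminal, none.
table1 = {
    ("same", "boys"): (156, 153, 89),
    ("same", "girls"): (163, 152, 94),  # 94 inferred from the printed total 183
    ("different", "boys"): (182, 129, 92),
    ("different", "girls"): (166, 144, 80),
}

# Totals for each level of the identical-letter factor.
letter_totals = [sum(row[i] for row in table1.values()) for i in range(3)]

# Totals for each level of shape and of sex.
shape_totals = {"same": 0, "different": 0}
sex_totals = {"boys": 0, "girls": 0}
for (shape, sex), row in table1.items():
    shape_totals[shape] += sum(row)
    sex_totals[sex] += sum(row)

print(dict(zip(["initial", "terminal", "none"], letter_totals)))
print(shape_totals)
print(sex_totals)
```

The letter totals differ widely (667, 578, and 355), while the shape totals (807 and 793) and the sex totals (801 and 799) are nearly equal, which is the pattern the significance tests report.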


It appears that prior to formal reading instruc- 
tion, children are attending more to the features of 
letters in words than they are to total word shape. 
This finding is of particular relevance to the prac-
tice of geometric and word-shape training in pre-
reading programs. Since children exhibit a strong
tendency to focus on the individual letters of words
they are to learn, does training to attend to another
cue such as word shape facilitate or inhibit their dis-
crimination learning? Additional research on the
transfer value of configuration training seems to be
in order.
TEACHING OBJECTIVES, STYLE, AND EFFECT
WITH THE CASE METHOD IN ENGINEERING

KARL H. VESPER
University of Washington

JAMES L. ADAMS
Stanford University


ABSTRACT

This article seeks to trace relationships between teaching objectives, teaching style, and
emphasis as perceived by students in an engineering course taught by four different instructors
using the case method of instruction. A checklist was used [illegible] objectives of the
instructors and effects perceived by the [illegible] inferences about teaching style.


Experimental data for this study were gathered in the first year graduate course, "Case
Studies in Mechanical Engineering," offered at the University of California, Berkeley, in 1968.
This course has been offered once a year since 1965 under the direction of Professor Robert
Steidel. It is taught by four people, each conducting approximately 2 weeks of it. The same
four people taught it from 1965-1968. The catalogue description of the 1968 course was:

    This course will introduce the case history and the case study as a means of
    experiencing mechanical design. Four or five selected case studies will be reviewed
    and discussed. These will involve the thermal control of a spacecraft, the machine
    design of an oil well drilling mechanism, a bearing problem, the control of a high
    speed centrifuge, and the structural design of a solar panel. Each will be critically
    analyzed to study the design process and the engineering decisions that were involved.

    This course is open to all graduate students in Mechanical Design, but it is required
    of all candidates for the Master of Engineering degree who expect to undertake an
    engineering case study in lieu of a design project or thesis in the Spring Quarter,
    1968.

Five cases were used in the 1968 course, one by each of the first three instructors and two by
the fourth. The first one used was a case problem and the rest case histories. The case problem
seeks to put the student into the position of an engineer facing an unsolved problem, while the
case history describes both the problem and its outcome so that the student can view both in
retrospect (for further description see reference 1). All cases described real situations
taken from industrial settings.

No requirements were imposed upon the individual teachers as to what cases they used or how
they used them except for the general time limitations of the course itself.


TEACHING OBJECTIVES 


For gathering data regarding apparent intrinsic aims of the course a questionnaire in the form
of a "Teaching Objective Checklist" [illegible] variety of possible teaching objectives. A copy
of this checklist was given to each instructor at the beginning and again at the end of his
section of the course and to each student at the end of each section. Instructors and students
were requested to indicate on the checklist the relative emphasis on each objective.

Before-and-after estimates by each of the instructors [illegible]. Teacher 2 did not complete
[illegible] four instructors. First of all, the variance between the before-and-after estimates
was compared to the variance between the "after" emphasis estimates for each teacher. Applying
an F-test to these variances indicated that for each teacher (excepting number 2, who did not
complete the checklist) there were significantly higher (at the .05 level) variations in
indicated emphasis between checklist items than between before-and-after estimates. If the
before-and-after differences are considered as "noise" and the differences in emphasis are
considered as "signal," the signal-to-noise ratio for the instructors is as follows (the
threshold of statistical significance is 1.84):

    Instructor    Signal/Noise
        1             3.78
        2              --
        3             1.93
        4             4.82
Another analysis of variance test was performed
on the "after" estimates of the various teachers in
order to check for similarity of emphasis estimates
between teachers. The F ratio (1.026) did not reach
the threshold of significance (1.65). This test there-
fore did not show statistically significant agreement
of estimated objectives between teachers. It is nev-
ertheless of interest to note the items which drew the
strongest indications of emphasis and non-emphasis.
These were as follows:


    Mean Rating    Checklist Item

    Indicated Highest Emphasis
        4.3        Spotting key facts amid less relevant data
        4.0        In prescribing action to be more specific
        4.0        Identifying and defining practical problems
        4.0        Knowledge of what engineers do and how they work

    Indicated Lowest Emphasis
        1.5        Emphasis on reasoning quantitatively
        2.0        Formulating idealized mathematical models
        2.0        Manipulating and solving mathematical models
        2.0        Aesthetic sensitivity

Although these results are not statistically significant (only three professors were polled),
they show considerable agreement with statistically significant results described elsewhere (3)
which were obtained with a sample of thirty professors.


TEACHING STYLES 


The four instructors in this course differed in 
homework assignments, conduct of class meetings, 
and type of questions asked. Homework assignments 
were roughly as follows: 


    Teacher    Assignment

       1       Take up the engineer's job at the end of the case and carry it further.

       2       Describe how you would follow or depart from the method of operation of
               the case engineer.

       3       Assume the role of a project engineer who is not mentioned in the case
               and formulate plans.

       4       Map the main decisions and events in the case in the form of a flowchart.


All teachers used discussion rather than lectures. 
However, from tape recordings it was found that 
classroom style varied measurably among the teach- 
ers. Table 2 shows pertinent quantities for each 
teacher during a sample class period (the second 15 
minutes of discussion).


Table 3 classifies the statements of the teachers 
into seven categories, the first four of which ask the 
students to comment. 


STUDENT RATINGS OF EMPHASIS 


Table 4 shows the average relative rankings given to each objective by the students. Number 1
represents the objective most emphasized in the opinion of the students and number 30
represents the one least emphasized. A Q-test for differences between many pairs of means was
used to test differences between higher and lower ranked objectives. The result was that the
four highest ranked items for each teacher have statistically significantly (at the .05 level)
higher mean ratings than the four lowest. Objectives ranked closer to each other could not be
ordered with this confidence. Items thus significantly separated have been marked with an "H"
for high and an "L" for low in Table 4.


RELATIONSHIPS BETWEEN OBJECTIVES, STYLE,
AND PERCEIVED EMPHASIS


It is of interest to infer a few relationships between teaching objectives as rated by the
instructors, teaching style as measured from tape recorder data, and emphasis as perceived by
the students. It is not possible to draw rigorous conclusions because of the many variables
(personality, organization) not covered by the testing techniques and because of the small
sample of teachers. However, inferred relationships are of interest because they demonstrate
the type of conclusions which can be drawn from this type of testing. Such relationships are
useful not only to the teacher who wishes to evaluate his techniques or reorient his course,
but also to those seeking to design new courses or curricula which include case studies.


TABLE 2

STATISTICS ON SECOND FIFTEEN MINUTES OF CLASS DISCUSSIONS IN BERKELEY CASE COURSE*

                               Teacher 1       Teacher 2       Teacher 3       Teacher 4
Session                        1       2       1       2       1       2       1       2
Number of Interchanges**       45      40      21      36      33      25      33      25
Time Teacher had Floor         44%     29%     37%     34%     72%     80%     44%     33%
Time Students had Floor        56%     71%     63%     66%     28%     20%     56%     67%
Teacher Average Time per
  Interchange (seconds)        8.3     5.4     14.2    [remaining entries illegible]
Student Average Time per
  Interchange (seconds)        10.7    15.3    24.6    14.9    [illeg.] 6.6    14.3    22.4
Total Words by Teacher         906     839     591     551     1508    2020    604     428

* Based on tape recordings of the discussions.

** Occasional brief interjections such as "yes," "OK," "I see" were not counted.


TABLE 3

TYPES OF COMMENTS BY TEACHERS IN CLASS DISCUSSIONS OF BERKELEY CASE COURSE

                                                     Fraction of Teacher's Total Statements
Type of Statement by Teacher*                        Teacher 1  Teacher 2  Teacher 3  Teacher 4

1. Asked what engineer(s) in case should do next        16%        0%         6%      [illeg.]
2. Asked student to clarify or justify something
   student had said                                     31%        28%        0%      [illeg.]
3. Requested specific facts from case                 [illeg.]     5%         1%      [illeg.]
4. Asked for student appraisal of what engineers
   in the case had done                                 9%         24%        9%      [illeg.]
   Items 1-4 combined                                   69%        57%        22%       20%
5. Summarized or reiterated student comments            20%        19%        21%       16%
6. Gave directions                                      0%         10%        15%       30%
7. Stated own opinion or rhetorical description         11%        14%        42%       34%

TOTALS                                                  100%       100%       100%      100%

* Classified subjectively by the experimenter based on tape recordings of the class
  discussions.


During the planning of the course, Professor Steidel attempted to schedule the four teachers so
as to begin with the most nondirective and increase the degree of directiveness as the term
progressed. There are several indications that he succeeded. The most notable are the
increasing ratio of directions given to total statements and the decreasing ratio of questions
to total statements seen in Table 3 (69% questions for teacher 1, 57% questions for 2, 22% for
3, and 20% for 4). Looking at [illegible] it can be seen that [illegible] asked what the
engineer(s) in the case should do next. This is consistent with his indicated emphasis on
[illegible] that instructor 3 most often "stated own opinion or rhetorical description." Table
2 indicates that instructor 3 held the floor much more than the other instructors (about 75% of
the time, as opposed to approximately 35% of the time). These characteristics of style are
consistent with unsolicited student complaints such as "talks too much," "tends to be too
opinionated," "answers his own questions without exploring class."

Instructor 1 used a case problem in which each chapter ended in a clear problem situation
[illegible] a particular engineer. Instructors [illegible] case histories. These choices
[illegible] them specifically. Similarly, [illegible] not rate teacher 1 high on [illegible]
they work" because his case was more problem oriented than description oriented.

Teacher 3 was the only one who was not rated low [illegible] theoret[illegible]. The previously
[illegible] more of the [illegible] emphasis on teaching students to "sell ideas" [illegible]
"argue more persuasively" as perceived by his students. Similarly, the high rating of teacher 1
perceived here is consistent with his high rating of "communicating" and "selling ideas" and
with his tendency to ask students more frequently to explain and justify their statements.

The concurrence of teacher objective emphasis and student perceived emphasis varied from
teacher to teacher. Considering the six significant student ratings (marked in Table 4), it can
be seen that [illegible] previously discussed "selling ideas." (He put one at the top of the
scale [illegible] at the opposite end.) Teacher 1 agreed on only [illegible] none out of six.

Several items checked [illegible] extremes of the scale can be considered as "misses." Since
the checklist was not sensitive enough to detect [illegible] list been truncated by removing
the highest [illegible] items and reapplied, enough further significant contrasts might have
emerged to turn some of the "misses" into "hits." However, no time was available to do this. In
the opinion of the investigator, a significant line of further investigation would have been an
attempt by the instructors to use the checklist to "steer" toward different rating patterns as
an indication of their degree of control over the teaching process with cases.


TABLE 4

RANKS OF STUDENT PERCEIVED EMPHASIS ON OBJECTIVES OF EACH BERKELEY CASE COURSE TEACHER*

                                                                           Teacher
                                                                  1        2        3        4
I. HABITS - Emphasis on increasing your tendency to
 1. Reason quantitatively (use numbers) whenever possible         24    [illeg.]    22       27
 2. Discriminate between fact and opinion                         17       11       18       21
 3. Search for more alternative solutions                         2        3        4        7
 4. In prescribing action be more specific                        [partly illegible: 4, 20, 16]
 5. Pay meticulous attention to detail                         [illeg.]    22       19       17
 6. Think more carefully before speaking                          8        26       18       23

II. SKILLS - Emphasis on developing your ability of
 1. Using unfamiliar exercise tools or exercise methods           23       26       14       21
 2. Communicating (writing or speaking or drawing)             [illeg.]    17       24       13
 3. Identifying and defining practical problems                   10       1        1        2
 4. Spotting key facts amid less relevant data                    21       9        3        1
 5. Foreseeing consequences of alternative actions                6        20       6        3
 6. Formulating idealized math models of real problems            29       27       25       29
 7. Manipulating and solving idealized math models                30       30       27       30
 8. Viewing problems and complications with perspective           [partly illegible: 11, 10, 9]
 9. Selling ideas or arguing more persuasively                    4        24       30       25

III. KNOWLEDGE - Emphasis on imprinting on your memory
 1. Historical episodes as lessons                                21       19       12       11
 2. What engineers do and how they work (typical activities)      14       2        2        4
 3. Criteria for judging engineering procedures                   13       1        9        6
 4. Criteria for evaluating designs                               17       7        6        8
 5. Mathematical, scientific, or engineering theory            [illeg.]    28       15       28
 6. Technical facts used in engineering (besides theory)          12       9        1        21
 7. Formats or class procedures unique to the course           [illeg.]    22       29       26

IV. ATTITUDES AND FEELINGS - Lifting your level of
 1. Self confidence                                               5        13       26       5
 2. Perseverance in spite of setbacks                             6        14       27       15
 3. Concern with questions unanswered for you yet                 [partly illegible: 18, 11]
 4. Enthusiasm, motivation for course or engineering              21       19       21       13
 5. Aesthetic sensitivity                                         26       29       23       24
 6. Interpersonal sensitivity (feel for people)                   [partly illegible: 17]
 7. Self knowledge (own abilities and limitations, etc.)          [illegible]
 8. Tolerance for other's ideas and errors                        8        14       15.5     13

* Two rank numbers appearing for a given item indicate ranges of ties in rank.
THE RELATIONSHIP OF ACHIEVEMENT
RESPONSIBILITY TO INSTRUCTIONAL
TREATMENTS

KINNARD WHITE
University of North Carolina, Chapel Hill

JAMES LEE HOWARD
North Carolina Advancement School, Winston-Salem


ABSTRACT

The hypothesis that students who are characterized as having external versus internal control
would achieve differently under different classroom instructional treatments was tested.
Thirty-two seventh-grade boys (sixteen externals and sixteen internals) were randomly assigned
to two instructional treatments in science. The treatments varied according to the amount of
control exercised by the teacher. The role-assumption treatment [illegible]


RECENT research designed to increase our knowledge of the relation of individual differences to
the learning of school children has demonstrated the importance of the variable
internal-external control of reinforcement (8, 11). Rotter's (12) social learning theory
postulates that the occurrence of a behavior in a particular circumstance is a function of the
individual's expectancy that the behavior will result in reinforcement. The control construct
in Rotter's theory is considered to be a generalized expectancy, operating across a large
number of situations, relating to whether the individual accepts responsibility for what
happens to him. Persons characterized as having internal control are considered to perceive
events as being a consequence of their own actions and therefore in some sense under their
personal control. Persons characterized as having external control are considered to perceive
events as being unrelated to their own behaviors and, consequently, beyond their personal
control.

Crandall, Katkovsky, and Crandall (2) have developed a scale (Intellectual Achievement
Responsibility, IAR) to measure children's beliefs in responsibility for reinforcement. This
scale is primarily devoted to school-related situations. Rather than focus on impersonal social
forces, the IAR scale has limited the sources of external control to persons such as teachers,
parents, and age mates, who are most likely to come into contact with the school-aged child.


Recent research on learning in laboratory settings using internal versus external control of
reinforcement as a variable has shown that the internal person learns more rapidly, is less
variable in his learning, generalizes his learning more, and retains more ([illegible]). Using
the IAR scale, Crandall, Katkovsky, and Preston (3) found that achievement-related activities
were highly related to control among boys. These relationships did not hold for girls.
Specifically, boys who attributed responsibility for achievement to themselves (internals)
spent more time in intellectual free-play, demonstrated greater intensity of striving in
intellectual free-play, and scored higher on intelligence tests, reading achievement tests, and
arithmetic achievement tests.

The evidence indicates that students characterized as externals do not perform as well as those
characterized as internals when exposed to conventional school learning situations.

Jackson's (7) analysis of this situation has led him to suggest that the teaching-learning
situations in the schools could be reconstructed so that students characterized as externals
would be forced to view the teaching-learning environment not as one in which the student is
performing in certain prescribed ways designed to elicit reinforcements from the teacher. If
the external student is placed in a situation in which his task is to explore an area
structured by himself, then more positive results could be expected. Rogers (10) and Maslow (9)
have also argued from motivational theory that the child must develop toward autonomy and away
from external control.


Current trends in teaching and curriculum have 
emphasized both the individualization of instruction 
and the use of inquiry methods to provide for differ- 
ential rates of learning (13). Actual attempts to im- 
plement such methods have met with varying degrees 
of success. Thelen (15) has suggested that one like- 
ly source of difficulty is the lack of knowledge of how 
teaching methods and student characteristics interact. 
Although theory has emphasized the nurturing of pro- 
ductive thinking through the design of learning expe- 
riences, application has lagged (6,18). Furthermore, 
very rarely have researchers applied motivational 


theory to the design and conduct of instructional 
programs. 


Bettelheim (1) has argued that the goals of en- 
hancing psychological health and effective functioning 
are primary to the learning process. In addition, the 
skill of emotional management is dependent upon the
development of autonomy and inner freedom within 
the individual. In developing his argument, Bettel- 
heim emphasizes the following points: 


First, the school must help the child overcome
his lack of identity. This may best be accomplished
by allowing the child to face conflicts rather than
avoid them.


Second, teachers must develop the skill of self- 
direction in children by allowing them to experience 
life first-hand. This will not be accomplished by pre-
packaged explanations of what life (or school subject
matter) is all about from the viewpoint of the teacher.
Furthermore, the child should be encouraged to ex- 
press his own point of view. 


Third, the school must provide opportunities for 
the individual to manage his anxieties and to under-
stand his own emotional reactions.


Fourth, the school should allow a child to encounter potentially dangerous external (failing)
situations directly and without condemnation, rather than hiding the danger or purposefully
reinforcing the failure externally. In this way, the child learns internally to identify the
potential danger in a situation and is allowed to learn from his failures rather than to avoid
or dismiss them.


The theoretical views on school curriculum and teaching expressed by Bettelheim, Jackson,
Hullfish, Siegel, Smith, and Thelen echo the argument put forward by Cronbach (4). Cronbach's
argument was that not only are treatments characterized by many dimensions, but so also are
persons. Consequently, we must attempt to deal with treatments and persons simultaneously. In
doing this, we must attempt to design treatments to fit groups of students with particular
personality characteristics.


The present research reported here was designed to test hypotheses derived from the theory and
empirical research data reviewed above. The hypothesis of primary interest was that the
perceptions of male students with regard to internal versus external control of reinforcement
in intellectual achievement situations would interact with instructional method in affecting
achievement. Specifically, the following interaction hypothesis was tested: Achievement for
externals would be superior under a method of instruction that stressed learner-directed
activities as compared to a method of instruction that stressed teacher-directed activities;
achievement for internals would not be affected by these differential instructional methods.
In addition to the interaction hypothesis, which was of major interest, the following
hypotheses were tested: (a) internals will achieve more than externals, and (b) students in the
learner-directed group will achieve more than students in the teacher-directed group.


METHOD 


The Ss for this study were thirty-two boys enrolled in
the fall 1968 term of the North Carolina Advancement
School. All Ss had elected to take a general science
course designed for seventh-grade students. The
Advancement School is a residential school designed to
conduct research on discrepant achievement. It is
maintained by the state and selects students from
public schools throughout the state. To be eligible, a
boy must have average or above average ability and
be achieving two or more grade levels below the
school grade in which he is currently placed. Both of
these criteria, ability and achievement, are determined
by standardized tests. Most students are permitted
to remain at the school for only one semester
(16 weeks). All Ss were in the seventh grade and
were enrolled in the Advancement School for the first
time.


Design 


A 2 X 2 factorial design with a pretest and posttest
was used for the study. There were two levels of in- 
structional method—role-assumption and structured 
class; reinforcement control consisted of two levels, 
internal and external. 


Procedure 


Assigned Variables. The Ss were categorized as 
being externals or internals on the basis of their 
scores on the IAR scale (2). This scale was routinely
administered to all boys entering the Advancement
School as a part of the testing program during
the first 2 days in residency. The median score on 
the IAR scale for those who elected to take the sci- 
ence course was used to establish categories of in- 
ternals and externals (median for the total group was 
25). Those above the grand median were considered to 
be internals (median on IAR for this group was 28), and
those below the grand median were considered to be 
externals (median on IAR for this group was 21). 
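
The median split described above can be sketched in a few lines of Python. This is only an illustration: the scores and subject labels below are hypothetical, not the study's data, and leaving scores that fall exactly at the median unclassified is an assumption, since the text does not say how ties were treated.

```python
from statistics import median

# Hypothetical IAR scores (illustrative only; not the study's data).
iar_scores = {"S01": 28, "S02": 21, "S03": 30, "S04": 19, "S05": 27, "S06": 22}

grand_median = median(iar_scores.values())

# Scores above the grand median -> internals; below -> externals.
# Scores exactly at the median are left unclassified here (an assumption).
internals = sorted(s for s, v in iar_scores.items() if v > grand_median)
externals = sorted(s for s, v in iar_scores.items() if v < grand_median)

print(grand_median, internals, externals)
```

With an even number of scores the median falls between the two middle values, so every subject is classified; with an odd number, one subject would sit exactly at the median under this rule.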


Manipulated Variables. The treatment which stressed
learner-directed activities was called a role-assumption
treatment. The sixteen students assigned
to this treatment were told that they were to assume
the role of a scientist and could spend their time during
the semester studying anything they desired as
long as it was related to science. The role of the
teacher was one of helping the student to secure
information and materials for experiments or other
ex[...] in which the student was interested. Students
in this treatment performed two or more
experiments during the semester. These experiments
included such things as weather prediction, the effects
of types of light upon plant growth, genetics, n[...]tion,
and animal care.


The treatment which stressed teacher-directed
activities was called the structured treatment. The
sixteen students assigned to this treatment were given
considerable direction as to the kinds of experiments
likely to be of value in their learning. They
were also given specific reading assignments for
which they were held responsible. Except for grades,
which were not given to students in either group,
students in the structured group were, in every way
possible, led to view the teacher as being in control of the
learning situations and passing out reinforcements
when they were justified.


The same two teachers taught both classes. Students 
were assigned to the treatments randomly, with eight 
students in each cell of the 4-cell design. 


Criterion and Covariates. The criterion was
achievement as measured by the Sequential Test of
Educational Progress (STEP): Science (5). This test
was selected as the criterion measure for three reasons.
First and most importantly, it was designed to
measure the skills in science that were the primary
instructional objectives of the science course taught
to both treatment groups for this research. The skills
measured by this test are: identifying scientific problems,
developing hypotheses, selecting valid procedures,
interpreting data, critically evaluating claims,
and quantitative and symbolic reasoning. Secondly,
the content areas of this test are approximately equally
divided into biological, earth (including astronomy
and meteorology), and physical science. This, too,
reflected the content range of the course. Thirdly, a
standardized test was desired.


Two control variables were used, control being
exercised by means of the analysis of covariance.
Intelligence (14) and the pretest STEP-Science achievement
were used as covariates.


RESULTS 


The means and standard deviations on the covariates
(IQ and pretest achievement) and the means, standard
deviations, and adjusted means for the criterion
(achievement) may be observed in Table 1.
The results of the analysis of covariance for
posttest achievement scores with IQ and pretest
achievement as covariates may be observed in Table 2.


TABLE 2


ANALYSIS OF COVARIANCE FOR POSTTEST
ACHIEVEMENT WITH PRETEST ACHIEVEMENT
AND IQ AS COVARIATES


Source                df       MS        F
Treatment (T)          1     112.74     6.72*
Responsibility (R)     1       0.16     0.01
T X R                  1     109.07     6.50*
Regression             2      73.03     4.35
Error                 26      16.78


* p < .02
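
As a quick arithmetic check, each F ratio in an analysis of covariance table is the effect mean square divided by the error mean square. The sketch below recomputes the three effect ratios from the mean squares in Table 2; the variable names are illustrative, and the T X R mean square is read as 109.07.

```python
# Mean squares as printed in Table 2 (analysis of covariance).
ms_error = 16.78
mean_squares = {"Treatment (T)": 112.74, "Responsibility (R)": 0.16, "T X R": 109.07}

# Each F ratio is the effect mean square divided by the error mean square.
f_ratios = {source: round(ms / ms_error, 2) for source, ms in mean_squares.items()}
print(f_ratios)
```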

The hypothesis predicting an interaction between
instructional method and internal-external type was
upheld by the results of this analysis. The interaction
was significant (F = 6.50, df = 1, 26, p < .02) and
in the predicted direction. The type of interaction
obtained is shown in Figure 1.


The level of achievement by students who were
classified as internals was not affected by the method
of instruction ($\bar{X}_{adj}$ for role-assumption was 24.57;
$\bar{X}_{adj}$ for structured was 24.89). However, achievement
by students who were classified as externals was quite
clearly affected by the method of instruction ($\bar{X}_{adj}$ for
role-assumption was 28.62; $\bar{X}_{adj}$ for structured was
20.54). This was predicted by the hypothesis: internals
did equally well in both instructional methods, presumably
because they brought to the learning situation,
regardless of how it was structured, the belief that they
were responsible for their reinforcements. However,
the achievement of ext[ernals] [...] which forced the
stu[dent] [...] learning situation, in cont[rast] [...]
which the teacher [...] reinforcements, resulted in
superior achievement for this type of student.
TABLE 1

MEANS AND STANDARD DEVIATIONS FOR COVARIATES AND MEANS, STANDARD DEVIATIONS, AND
ADJUSTED MEANS FOR EACH CELL (N=8) IN THE EXPERIMENTAL DESIGN

                        INTERNALS                                EXTERNALS
                Role             Structured              Role             Structured
Variable     SD     X    X adj   SD      X    X adj   SD      X    X adj   SD      X     X adj
IQ         10.30  96.12         9.20  102.88         16.31  104.00        10.61  96.[...]
Ach (Pre)   4.75  19.62         3.33   19.25          5.68   22.38         4.31  20.50
Ach (Post)  4.59  23.25  24.57  4.64   24.88  24.89   4.78   30.00  28.62  4.21  20.[...]  20.54
FIGURE 1


GRAPH OF REINFORCEMENT TYPE BY TREATMENT
INTERACTION (F = 6.50, df = 1, 26, p < .02)

[Line graph: adjusted achievement means (vertical axis, 20 to 30)
for internals and externals (horizontal axis), with separate lines
for the role-assumption and structured treatments.]


There was a main effect for the treatment factor
(F = 6.72, df = 1, 26, p < .02). Students assigned to the
role-assumption treatment group achieved much better
than those assigned to the structured treatment
group. The adjusted mean for posttest achievement
for those in the role-assumption treatment was 26.60,
and 22.72 for students in the structured treatment.
The hypothesized difference in achievement between
externals and internals was not supported (F = 0.01,
df = 1, 26, p was nonsignificant).
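
The relation between the cell means and the treatment means can be sketched as follows. The four adjusted cell means are read from Table 1; averaging them without weights is an assumption that is reasonable here only because every cell has the same number of students (N = 8). The function and variable names are illustrative.

```python
# Adjusted posttest cell means (n = 8 per cell), read from Table 1.
adjusted = {
    ("internal", "role"): 24.57,
    ("internal", "structured"): 24.89,
    ("external", "role"): 28.62,
    ("external", "structured"): 20.54,
}

def treatment_mean(treatment):
    """Unweighted mean of the two cells for one treatment (cells have equal n)."""
    cells = [m for (_, t), m in adjusted.items() if t == treatment]
    return sum(cells) / len(cells)

role = treatment_mean("role")               # about 26.60
structured = treatment_mean("structured")   # about 22.72

# Simple effects of treatment within each locus-of-control group.
effect_internals = adjusted[("internal", "role")] - adjusted[("internal", "structured")]
effect_externals = adjusted[("external", "role")] - adjusted[("external", "structured")]
print(role, structured, effect_internals, effect_externals)
```

The two simple effects make the interaction visible: the treatments differ by about 8 points for externals and by less than half a point for internals, so the main effect for treatment comes almost entirely from the externals.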


DISCUSSION


The results clearly indicate that it is possible to
design treatments to match personality types of students
to improve achievement (4). Under the normal
school learning situation, the structured treatment,
external students did not perform well on the science
achievement test. However, when placed in a method
of instruction designed especially to fit their personality
type, their achievement was improved to a level
considerably above that of internal students. Such
results should be encouraging to designers of instructional
methods and should alert applied psychologists
to the possibility of using meaningful personality variables
as a basis for assigning students to classes.

The expected difference in achievement between
externals and internals did not materialize. On the
basis of the available data, this can best be accounted
for by the exceptionally high level of achievement by
the externals in the role-assumption treatment. These
data indicate that, under the ordinary circumstances
of a structured treatment, the difference in achievement
between externals and internals would have
been observed.


The hypothesized difference in achievement between
students assigned to the role-assumption treatment
and students assigned to the structured treatment
was supported. This finding can be accounted
for primarily by the differential achievement of externals
in the two treatments. The evidence clearly
indicates that it would be best to employ the role-assumption
treatment regardless of whether the students
are classified as externals or internals. Since
there was no difference between treatments for internal
students, but a large difference between treatments
for external students favoring the role-assumption
treatment, the decision of which method to use
has to favor the role-assumption treatment. These
findings indicate that a knowledge of the type of students
involved can be an important variable to consider.
If a large proportion of the students are externals,
as is the case at the North Carolina Advancement
School, then it would be of particular importance
to implement a role-assumption treatment.


These results provide some support for the points 
of view expressed by Thelen (16, 17) and Siegel (13).
Thelen has placed emphasis upon the development of 
process skills and the use of inquiry methods to de- 
velop stable and transferable knowledge. The role- 
assumption treatment used in this research may be 
broadly classified as a type of inquiry approach. Sie- 
gel has recently emphasized that teachers must shift 
from didactic teaching methods to the more complex 
dialectic teaching processes. The results of this re- 
search support this concept. 
A REGRESSION APPROACH TO EXPERIMENTAL DESIGN


The University of North Dakota


ABSTRACT 


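This paper presents a regression approach to experimental design. Typically, the
educational researcher is familiar with the usual analysis of variance techniques and
is unaware of the versatility of multiple regression for problem solving. The specific
approach used in this paper assumes only a familiarity with experimental design and
the use of a general purpose multiple regression program which is likely to be on hand
at any computer installation. Examples of the t-test, 1-way analysis of variance, and
the treatment X subjects design from a regression standpoint are given.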


MANY EDUCATIONAL researchers will
have had exposure to several applications of analysis
of variance, and quite commonly this will have
occurred in a course called "experimental design" or
some similar title. The classical analysis of variance
approach will commonly expose the student to
several [...] topics, such as the treatment X
subjects design, the [...] design, the randomized
block design, the split-plot design, and the analysis
of covariance.

It is the contention of this writer that researchers
who have received their statistical training in the
classical approach have at least two difficulties that
[...] their data analysis procedures. For any large
amount of data, a computer would seem a necessary
adjunct to their analysis; however, there may be a
strong tendency for researchers trained in the classical
approach to rely heavily on stock programs
when using the computer. Further, they are likely
to look for a different stock program for each design.
A second difficulty that may be likely to occur is that
a person trained in the classical approach will try to
force his design into an existing design rather than
formulate the model most appropriate for his unique
problem.

The present discussion focuses on some designs
familiar to the educational researcher. The usual
analysis of variance presentation occurs in such textbooks
as Lindquist (7) or Edwards (4). While this
presentation is concerned only with well-known designs,
it is hoped that the reader will be able to at




least start to formulate his research problems so 
that he might be able to submit his data to a comput- 
er and receive answers to specific research ques- 
tions that are of interest. 


A word is in order as to the direction of the
discussion: only one computer program, the general
purpose multiple regression program likely to be on
hand at any modern computer installation, will be
considered. Thus, a minimum of sophistication is
necessary regarding programming. A somewhat
similar approach is used by Kelley and others (6),
who employ programs known as LINEAR and DATRAN.
Bottenberg and Ward (2) have used a similar approach
(also using DATRAN) in presessions of the
American Educational Research Association annual
conventions.


A first step toward using multiple regression as
a problem-solving technique would be to formulate
models for multiple regression for several of the
more familiar designs. A natural starting point is
the t-test.


THE t-TEST


Consider the following data: 


Group 1 Group 2 
25 25 
24 23 
23 21 
17 
= 2 
21 20 
18 15 
17 14 
16 12 
16 10 
16 10 
15 12 
14 H 
13 10 
13 1 
13 6 
12 6 
11 8 
10 4 
9 1 


When the usual t-test is run, a value of 1.97 is
found. Using a regression approach, it is first useful
to define two binary predictors, $X_1$ and $X_2$:


$X_1$ = 1 if the score is from a member of Group 1;
and 0 otherwise,


$X_2$ = 1 if the score is from a member of Group 2;
and 0 otherwise.


A linear model can be written for this situation:

$$Y = b_0 + b_1X_1 + b_2X_2 + e$$

where:

$Y$ = the criterion score,
$b_0$ = the Y-intercept,
$b_1$ = the regression coefficient for $X_1$,
$b_2$ = the regression coefficient for $X_2$,
$e$ = the error involved in prediction.

The preceding information can be put in a table
helpful for preparation of [...]. One card can be
prepared for [...] information regarding group [...].


The conceptualizing that usually takes place regarding
the use of the t-test might involve the question,
do the means differ significantly? On the other
hand, in a regression formulation, the researcher's
thought process may involve using the knowledge of
group membership to predict the criterion scores.
In the final analysis, both approaches use the same
linear model.


If the multiple regression program is used with
the data from Table 1, with the two predictor variables,
$X_1$ and $X_2$, and with the $Y$ variable as the
criterion, the probable result is that the program
will not run and will simply report back something
like "MATRIX SINGULAR SELECTION IS SKIPPED
AS REQUESTED." For the person unfamiliar with
the computer, many different reactions can occur,
from disgust to "I knew it couldn't do it." Without
going into technical details, the problem is one of
supplying too much information. Going back to one
of the predictor variables, say $X_1$, either a 1 or a
0 is recorded for every criterion measure. The
next column, $X_2$, is simply the reverse of $X_1$; that
is, if there was a 0 in $X_1$, then $X_2$ has to be a 1.
Thus, only one of the predictor variables is necessary
to impart all the information regarding group
membership.
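
The "matrix singular" message can be illustrated with a short sketch. A regression program has to solve the normal equations, which requires inverting the matrix X'X built from the intercept column and the predictor columns; when one column is an exact combination of the others, the determinant of X'X is zero and no inverse exists. The code below is a minimal illustration in plain Python, assuming 20 scores per group as in the example; the helper names `gram` and `det` are hypothetical and are not part of any regression program mentioned in the text.

```python
from fractions import Fraction

def gram(columns):
    """Return X'X for a design matrix given as a list of columns."""
    return [[sum(a * b for a, b in zip(ci, cj)) for cj in columns] for ci in columns]

def det(m):
    """Determinant by Gaussian elimination with exact rational arithmetic."""
    m = [[Fraction(v) for v in row] for row in m]
    n, sign, result = len(m), 1, Fraction(1)
    for c in range(n):
        pivot = next((r for r in range(c, n) if m[r][c] != 0), None)
        if pivot is None:
            return Fraction(0)          # no usable pivot: the matrix is singular
        if pivot != c:
            m[c], m[pivot] = m[pivot], m[c]
            sign = -sign
        result *= m[c][c]
        for r in range(c + 1, n):
            f = m[r][c] / m[c][c]
            m[r] = [a - f * b for a, b in zip(m[r], m[c])]
    return sign * result

n1 = n2 = 20                      # group sizes from the t-test example
ones = [1] * (n1 + n2)            # column for the intercept b0
x1 = [1] * n1 + [0] * n2          # 1 for Group 1, 0 otherwise
x2 = [0] * n1 + [1] * n2          # 1 for Group 2, 0 otherwise

det_both = det(gram([ones, x1, x2]))   # intercept + X1 + X2
det_one = det(gram([ones, x1]))        # intercept + X1 only
print(det_both, det_one)
```

Because the two binary columns always sum to the intercept column, the first determinant is exactly zero, while dropping one predictor gives a nonzero determinant and a solvable system.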


TABLE 1


REGRESSION FORMULATION OF THE t-TEST


Y                               X1    X2

(each of the 20 Group 1 scores)  1     0
(each of the 20 Group 2 scores)  0     1


For the information in Table 1, $X_1$ was used as a
predictor of the criterion variable, using the general
purpose multiple regression program.


WILLIAMS 85 


TABLE 2


MULTIPLE REGRESSION SOLUTION FOR t-TEST SITUATION


SELECTION 1


Variable No.   Mean       SD        Correlation   Regression    Standard Error   Computed   Beta
                                    X vs Y        Coefficient   of Regression    t Value
                                                                Coefficient
1              0.50000    0.50637   0.30497       3.55000       1.79835          1.97403    0.30497
Dependent
3             14.77500    5.89431
Intercept                    13.00000
Multiple Correlation          0.30497
Standard Error of Estimate    5.68689


ANALYSIS OF VARIANCE FOR THE REGRESSION


Source of Variation          Degrees of   Sum of        Mean Squares   F Value
                             Freedom      Squares
Attributable to regression    1            126.02502    126.02502      3.89678
Deviation from regression    38           1228.94995     32.34077
Total                        39           1354.97485
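
The internal consistency of the printout can be checked from a few of its summary figures. The sketch below is an illustration rather than a rerun of the regression program: it assumes 20 scores per group, takes the regression coefficient (3.55) and the total sum of squares (1354.97485) from the printout, and uses the standard result that with a single 0/1 predictor the sum of squares due to regression equals b1 squared times n1*n2/(n1+n2).

```python
import math

# Summary figures from the printout for the 40 scores (20 per group).
n1 = n2 = 20
b1 = 3.55                    # regression coefficient for X1 (difference between group means)
ss_total = 1354.97485        # total sum of squares
df_resid = n1 + n2 - 2

# With one 0/1 predictor, SS due to regression = b1^2 * n1*n2/(n1+n2).
ss_reg = b1 ** 2 * n1 * n2 / (n1 + n2)
ss_resid = ss_total - ss_reg
se_estimate = math.sqrt(ss_resid / df_resid)

# Standard error of b1 and the resulting t; F for the regression equals t squared.
se_b1 = se_estimate * math.sqrt(1 / n1 + 1 / n2)
t = b1 / se_b1
f = ss_reg / (ss_resid / df_resid)
r_squared = ss_reg / ss_total
print(t, f, r_squared)
```

Under these assumptions the computed t is about 1.974, F is about 3.897 (the square of t), and R squared is about .093, the square of the multiple correlation .30497.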
Table 2 contains the printout (not including
the residuals) for the data in Table 1.
The regression coefficient is for the predictor variable
$X_1$; we have, in effect, set $b_2 = 0$ by this approach.
The intercept is also included. This value is
13.00 and is the mean of the group that was coded
0 in the $X_1$ variable. The multiple correlation,
R, is .30497. Since there is only one predictor
variable, this value is in fact a 0-order correlation.
Both the standard error of estimate and an analysis
of variance for the regression are included in the
printout. An option is available in the multiple
regression program for a table of residuals if they are
desired.


It can be seen that the results that were obtained
by the usual t-test were also found by using the
regression approach. In addition to the computed t
value (t = 1.974), the F value of 3.897 is also
equal to $t^2$. Additionally, if the correlation coefficient
is squared ($R^2$ = .0930), this indicates that
about 9 percent of the criterion variance can be accounted
for by knowledge of group membership.


1-WAY ANALYSIS OF VARIANCE


The situation for analysis of variance can be seen
to be very similar to the regression analysis.
Consider the following data:


Group 1   Group 2   Group 3   Group 4
   4        13        11        10
   2        10         9         9
   0         7         7         8


The usual analysis of variance for this data is reported
in Table 3.


TABLE 3


SUMMARY TABLE FOR THE ANALYSIS OF
VARIANCE


Source of        Degrees of   Sum of     Mean      F
Variation        Freedom      Squares    Squares

Among groups      3           123.00     41.00     9.11
Within groups     8            36.00      4.50
Total            11           159.00


The intent here is to show that the same results can
also be accomplished using the multiple regression
program, rather than focusing on the analysis of
variance per se.

To conceptualize the problem for a regression 
analysis, it is helpful to define a set of binary pre- 
dictors. Four binary predictors can be defined to 
correspond to the four groups: 


$X_1$ = 1 if the score is from a member of Group
1; 0 otherwise,


$X_2$ = 1 if the score is from a member of Group
2; 0 otherwise,


$X_3$ = 1 if the score is from a member of Group
3; 0 otherwise,


$X_4$ = 1 if the score is from a member of Group
4; 0 otherwise.


A linear model can be written for this situation:

$$Y = b_0 + b_1X_1 + b_2X_2 + b_3X_3 + b_4X_4 + e$$

where:

$Y$ = the criterion score,
$b_0$ = the Y-intercept,
$b_1$ = the regression coefficient for $X_1$,
$b_2$ = the regression coefficient for $X_2$,
$b_3$ = the regression coefficient for $X_3$,
$b_4$ = the regression coefficient for $X_4$,
$e$ = the error involved in prediction.


The information regarding group membership and
the criterion scores can be put into a format similar
to Table 1 and is found in Table 4.


TABLE 4


REGRESSION FORMULATION FOR 1-WAY
ANALYSIS OF VARIANCE


 Y   X1   X2   X3   X4
 4    1    0    0    0
 2    1    0    0    0
 0    1    0    0    0
13    0    1    0    0
10    0    1    0    0
 7    0    1    0    0
11    0    0    1    0
 9    0    0    1    0
 7    0    0    1    0
10    0    0    0    1
 9    0    0    0    1
 8    0    0    0    1


Again, it can be remembered that the last column
is actually not adding any new information and can
be considered redundant. If $b_4$ is set equal to 0, the
prediction equation can be rewritten as

$$Y = b_0 + b_1X_1 + b_2X_2 + b_3X_3 + e$$


Using this setup, the previous data was analyzed with 
the general purpose multiple regression program. 


Again, the printout will normally include the means,
standard deviations, correlations with the criterion,
regression coefficients, standard errors of the regression
coefficients, computed t values, and the beta
coefficients. The intercept for this data is given as
9.00. The multiple correlation, R, is equal to .87954,
and the standard error of estimate is 2.1213. In the
analysis of variance for regression, the following
summary table (Table 5) is routinely printed out
(there is also included in the printout the previously
mentioned items, but they are omitted here).


TABLE 5


ANALYSIS OF VARIANCE FOR REGRESSION


Source of           Degrees of   Sum of     Mean      F
Variation           Freedom      Squares    Squares

Attributable to
Regression           3           123.00     41.00     9.11
Deviation from
Regression           8            36.00      4.50
Total               11           159.00


The usage of the regression program should become
apparent. Not only is the data available that
normally is a part of analysis of variance, but also
a measure of the amount of variance that can be accounted
for by the predictor variables can be found.
Here R = .87954 and, therefore, $R^2$ = .7736; thus
77.36 percent of the criterion variance can be accounted
for by knowledge of the group membership.

One additional advantage is that, as the
student becomes familiar with the regression approach,
he can more fully understand the logic of both regression
and experimental analysis.
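
The 1-way example can be reproduced with a small least-squares routine. The sketch below is not the general purpose multiple regression program referred to in the text; it is a minimal stand-in written in plain Python that solves the normal equations directly, with the helper name `least_squares` chosen for illustration. The scores are those of the four groups in the example, and Group 4 serves as the reference group because its predictor is dropped.

```python
def least_squares(rows, y):
    """Fit y = b0 + b1*x1 + ... by solving the normal equations (X'X)b = X'y."""
    X = [[1.0] + [float(v) for v in row] for row in rows]
    p = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [[sum(x[i] * x[j] for x in X) for j in range(p)]
         + [sum(x[i] * yi for x, yi in zip(X, y))] for i in range(p)]
    for c in range(p):                       # Gauss-Jordan elimination
        piv = max(range(c, p), key=lambda r: abs(A[r][c]))
        if abs(A[piv][c]) < 1e-9:
            raise ValueError("singular matrix: a predictor is redundant")
        A[c], A[piv] = A[piv], A[c]
        for r in range(p):
            if r != c:
                f = A[r][c] / A[c][c]
                A[r] = [a - f * b for a, b in zip(A[r], A[c])]
    b = [A[i][p] / A[i][i] for i in range(p)]
    fitted = [sum(bj * xj for bj, xj in zip(b, x)) for x in X]
    return b, fitted

# Scores and group labels from the 1-way example (three scores per group).
scores = [4, 2, 0, 13, 10, 7, 11, 9, 7, 10, 9, 8]
groups = [1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4]

# Binary predictors X1..X3; X4 is dropped (b4 = 0), so Group 4 is the reference.
rows = [[int(g == k) for k in (1, 2, 3)] for g in groups]
b, fitted = least_squares(rows, scores)

mean = sum(scores) / len(scores)
ss_total = sum((s - mean) ** 2 for s in scores)
ss_resid = sum((s - fit) ** 2 for s, fit in zip(scores, fitted))
ss_reg = ss_total - ss_resid
df_reg, df_resid = 3, len(scores) - 4
f_value = (ss_reg / df_reg) / (ss_resid / df_resid)
r_squared = ss_reg / ss_total
print(b, ss_reg, ss_resid, f_value, r_squared)
```

Under these assumptions the intercept comes out as the Group 4 mean, each coefficient is the difference between a group mean and the Group 4 mean, and the sums of squares and F agree with the analysis of variance summary.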


FACTORIAL DESIGNS 


The factorial de[sign] [...] or fixed effects de[sign] [...]
the same conceptualizati[on] [...] [betw]een the 1-way and
[...] additional variables (fac[tors]) [...] is the inclusion
of the interaction effect. Jennings [...] conceptualized
in the regression approach. Rather than duplicate his
effort, the reader is encouraged to read Jennings'
original article.


TREATMENT X SUBJECTS DESIGN 


The treatment X subjects design is described in
Lindquist (7), and is a repeated measures
design wherein each S serves as his own control.

Consider the following data:


Subject   Treatment 1   Treatment 2   Treatment 3

   1          18            27            15
   2          17            24            14
   3          14            13            12
   4           5             8             6
   5          11            14            10
   6           9            12             8
   7          14            16            15
   8          12            17             9
   9          22            21            16
  10          10            18            15
10 18 15 
When the usual analysis of variance is performed, a
summary table (see Table 6) is found.


TABLE 6


ANALYSIS OF VARIANCE FOR TREATMENT X
SUBJECTS DESIGN


Source of      Degrees of   Sum of     Mean      F
Variation      Freedom      Squares    Squares

Treatments      2           136.27     68.13     11.52
Subjects        9           521.20     57.91
Error          18           106.39      5.91
Total          29           763.86
On the other hand, if the problem is viewed from
a regression standpoint, a solution can be obtained in
stages. First, a Full Model is to be constructed.
Very simply, a full model will contain all of the
information available for a given situation. For the
treatment X subjects design, two bits of information
are available for each criterion score, the treatment
involved and the subject involved. If a set of
thirteen binary predictors are constructed, the full
model is:

$$Y = b_0 + b_1X_1 + b_2X_2 + b_3X_3 + b_4X_4 + b_5X_5 + b_6X_6 + b_7X_7 + b_8X_8 + b_9X_9 + b_{10}X_{10} + b_{11}X_{11} + b_{12}X_{12} + b_{13}X_{13} + e_1$$

where:

$Y$ = the criterion score,

$X_1$ = 1 if the score is from subject number 1;
0 otherwise,

$X_2$ through $X_{10}$ are defined similar to $X_1$,

$X_{11}$ = 1 if the score is from a member of treatment
1; 0 otherwise,

$X_{12}$ = 1 if the score is from a member of treatment
2; 0 otherwise,

$X_{13}$ = 1 if the score is from a member of treatment
3; 0 otherwise,

$e_1$ = the error in prediction.


The regression coefficients are defined in terms of
the corresponding predictor, with $b_0$ being the
Y-intercept. The information from the preceding can
be conveniently put into a table (see Table 7) so
that it can be readily put on IBM cards.


As happened before, we actually have too much
information. That is, $X_{10}$ is determined by the previous
nine predictors. Thus, it can simply be discarded
as a predictor. This is the same as $b_{10} = 0$.
Likewise, $X_{13}$ is dependent upon $X_{11}$ and $X_{12}$
and can also be discarded as a predictor. This also
is the same as setting $b_{13} = 0$. The full model
will then use eleven predictors and be referred to
as selection one. It will also be called Model I.


Two restricted models will be defined. First,
ten binary predictors can be constructed, one for
each subject, and then the last predictor is deleted;
the regression equation becomes:

$$Y = b_{14} + b_{15}X_1 + b_{16}X_2 + b_{17}X_3 + b_{18}X_4 + b_{19}X_5 + b_{20}X_6 + b_{21}X_7 + b_{22}X_8 + b_{23}X_9 + e_2$$

where:

$Y$ = the criterion variable,

$X_1$ through $X_9$ are defined the same as in the
Full Model (Model I),

$b_{14}$ = the Y-intercept,

$b_{15}$ = the regression coefficient corresponding
to $X_1$,

$b_{16}$ through $b_{23}$ are similarly defined,

$e_2$ = the error in prediction for subjects.


This formulation can be called Model II and will
yield the subjects effect.
Finally, a model (Model III) can be defined for
treatments (columns) and is

$$Y = b_{24} + b_{25}X_{11} + b_{26}X_{12} + e_3$$

where:

$Y$ = the criterion score,
TABLE 7


REGRESSION FORMULATION FOR TREATMENT X SUBJECTS DESIGN


 Y   X1  X2  X3  X4  X5  X6  X7  X8  X9  X10  X11  X12  X13
18    1   0   0   0   0   0   0   0   0   0    1    0    0
27    1   0   0   0   0   0   0   0   0   0    0    1    0
15    1   0   0   0   0   0   0   0   0   0    0    0    1
17    0   1   0   0   0   0   0   0   0   0    1    0    0
24    0   1   0   0   0   0   0   0   0   0    0    1    0
14    0   1   0   0   0   0   0   0   0   0    0    0    1
14    0   0   1   0   0   0   0   0   0   0    1    0    0
13    0   0   1   0   0   0   0   0   0   0    0    1    0
12    0   0   1   0   0   0   0   0   0   0    0    0    1
 5    0   0   0   1   0   0   0   0   0   0    1    0    0
 8    0   0   0   1   0   0   0   0   0   0    0    1    0
 6    0   0   0   1   0   0   0   0   0   0    0    0    1
11    0   0   0   0   1   0   0   0   0   0    1    0    0
14    0   0   0   0   1   0   0   0   0   0    0    1    0
10    0   0   0   0   1   0   0   0   0   0    0    0    1
 9    0   0   0   0   0   1   0   0   0   0    1    0    0
12    0   0   0   0   0   1   0   0   0   0    0    1    0
 8    0   0   0   0   0   1   0   0   0   0    0    0    1
14    0   0   0   0   0   0   1   0   0   0    1    0    0
16    0   0   0   0   0   0   1   0   0   0    0    1    0
15    0   0   0   0   0   0   1   0   0   0    0    0    1
12    0   0   0   0   0   0   0   1   0   0    1    0    0
17    0   0   0   0   0   0   0   1   0   0    0    1    0
 9    0   0   0   0   0   0   0   1   0   0    0    0    1
22    0   0   0   0   0   0   0   0   1   0    1    0    0
21    0   0   0   0   0   0   0   0   1   0    0    1    0
16    0   0   0   0   0   0   0   0   1   0    0    0    1
10    0   0   0   0   0   0   0   0   0   1    1    0    0
18    0   0   0   0   0   0   0   0   0   1    0    1    0
15    0   0   0   0   0   0   0   0   0   1    0    0    1


$X_{11}$ and $X_{12}$ are defined as before,

$b_{24}$ = the Y-intercept for Model III,

$b_{25}$ = the regression coefficient for $X_{11}$,

$b_{26}$ = the regression coefficient for $X_{12}$,

$e_3$ = the error in prediction for treatments.


The variable $X_{13}$ was deleted here for the same reason
that it was deleted in the Full Model; that is, it
introduces no new information, and is thus redundant.
Model III will be called selection three.


With the three selections, each of the sources of variation
can be separately determined. After combining
the results of the three selections, essentially
the same results as found in Table 6 can be gathered.
Actually, any one of the selections could have
been omitted and the other portion could have been
found as a residual.


Using the multiple correlations from each selection,
the following additional results were found:


(a) With the full model, R = .92774, and $R^2$ =
.8607.


(b) With the 9-predictor model system, that is,
the "subject" effect, R = .82602, and $R^2$ = .6823.


(c) With the regression model for treatment effects,
R = .42236, and $R^2$ = .1784.


Hence, it can be seen that the "subjects" accounted
for the most variance.
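
The three selections can be reproduced from the example data with a small least-squares routine. As before, this is a plain Python stand-in for a regression program, not the program described in the text; the helper name `r_squared` and the way the predictor columns are built are illustrative choices. Model I uses the nine subject predictors and two treatment predictors, Model II only the subject predictors, and Model III only the treatment predictors.

```python
def r_squared(rows, y):
    """R^2 from a least-squares fit of y on the given predictors plus an intercept."""
    X = [[1.0] + [float(v) for v in row] for row in rows]
    p = len(X[0])
    A = [[sum(x[i] * x[j] for x in X) for j in range(p)]
         + [sum(x[i] * yi for x, yi in zip(X, y))] for i in range(p)]
    for c in range(p):  # Gauss-Jordan elimination on [X'X | X'y]
        piv = max(range(c, p), key=lambda r: abs(A[r][c]))
        if abs(A[piv][c]) < 1e-9:
            raise ValueError("singular matrix: a predictor is redundant")
        A[c], A[piv] = A[piv], A[c]
        for r in range(p):
            if r != c:
                f = A[r][c] / A[c][c]
                A[r] = [a - f * b for a, b in zip(A[r], A[c])]
    coef = [A[i][p] / A[i][i] for i in range(p)]
    mean = sum(y) / len(y)
    ss_total = sum((yi - mean) ** 2 for yi in y)
    ss_resid = sum((yi - sum(bj * xj for bj, xj in zip(coef, x))) ** 2
                   for x, yi in zip(X, y))
    return 1 - ss_resid / ss_total

# Scores for subjects 1-10 under treatments 1-3, from the example data.
data = [(18, 27, 15), (17, 24, 14), (14, 13, 12), (5, 8, 6), (11, 14, 10),
        (9, 12, 8), (14, 16, 15), (12, 17, 9), (22, 21, 16), (10, 18, 15)]
y, subj, treat = [], [], []
for s, row in enumerate(data, start=1):
    for t, score in enumerate(row, start=1):
        y.append(score)
        subj.append(s)
        treat.append(t)

subject_cols = [[int(s == k) for k in range(1, 10)] for s in subj]   # X1..X9
treat_cols = [[int(t == k) for k in (1, 2)] for t in treat]          # X11, X12

r2_full = r_squared([a + b for a, b in zip(subject_cols, treat_cols)], y)  # Model I
r2_subjects = r_squared(subject_cols, y)                                    # Model II
r2_treatments = r_squared(treat_cols, y)                                    # Model III
print(r2_full, r2_subjects, r2_treatments)
```

Because every subject receives every treatment, the subject and treatment predictors are orthogonal once the intercept is removed, so the Model II and Model III values add up to the Model I value; this is why any one selection could have been found as a residual.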


While this entire analysis has been concerned
with the treatment X subjects design described
by Lindquist, exactly the same set of linear
models could be written for the randomized block
design.

OTHER DESIGNS 


Other designs such as Latin squares
and split-plot designs can also be conceptualized
and completed using a regression approach.
Schmid (9) has demonstrated by an example that
the analysis of covariance can be formulated from
a regression approach and he has shown that the
two approaches yield an identical F value. Because
of the simplicity of the regression approach to analysis
of covariance, it is worth further discussion
here. It should be pointed out that the name
"analysis of covariance" may not specifically be used
by an author using the regression approach. Bottenberg
and Ward (2) use the term "concomitant [...]."


The analysis of covariance can be accomplished
by successive use of the general purpose multiple
regression program. Suppose, for example, that
the researcher wishes to use two covariates. Let
these predictors be notated $X_1$ and $X_2$. Suppose also
that three groups are used as the treatment
groups. The full model can be given as:

$$Y = b_0 + b_1X_1 + b_2X_2 + b_3X_3 + b_4X_4 + e_1$$


where: 
Y = the criterion score, 
Хү = the first covariate, 
Xs 7 the second variate, 


х; = lif the score is from a member of Group 
1; 0 otherwise, 


X4 1 if the score is from a member of Group 
2; 0 otherwise, 


€; = the error involved in prediction for this 
model (the Full Model), 


bg = the Y-intercept for the Full Model, 
b; through b4 are regression coefficients for 
their respective predictor variables. 


It should be noticed that the third group has sim- 
ply been identified by not having membership in 
groups 1 and 2, as was done in the regression con- 
ceptualization of 1-way analysis of variance. With 
the covariance design, it is useful to utilize the R²
value directly from the printout of the multiple re- 
gression program. 


A restricted model can be formed, using only the 
covariates: 


$$Y = b_5 + b_6X_1 + b_7X_2 + e_2$$

Then the test of significance between the full and
restricted models can be completed:


$$F = \frac{(R^2_{FM} - R^2_{RM})/df_1}{(1 - R^2_{FM})/df_2}$$

The $R^2_{FM}$ is a symbol for the R² term from the
full model, and $R^2_{RM}$ is a symbol for the R² term
from the restricted model. The degrees of freedom
in the numerator ($df_1$) is K-1 where K is the number
of experimental groups. The degrees of freedom
in the denominator ($df_2$) is N-C-K where N is


the number of subjects, C is the number of covar- 
iates, and K is the number of experimental groups. 
It should be pointed out that this formulation does not 
provide a test for interaction. 
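
As a minimal sketch of the full-versus-restricted comparison described above, the following fits both models by ordinary least squares and forms the F ratio from the two R² values. The data are hypothetical (three groups of ten subjects, two covariates), and the function names `r_squared` and `ancova_f` are illustrative assumptions rather than part of any particular regression program.

```python
import numpy as np


def r_squared(X, y):
    """R^2 from a least-squares fit of y on X (X includes a unit column)."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    ss_res = float(resid @ resid)
    ss_tot = float(((y - y.mean()) ** 2).sum())
    return 1.0 - ss_res / ss_tot


def ancova_f(y, covariates, group, n_groups):
    """F test of the full model (covariates plus group vectors)
    against the restricted model (covariates only)."""
    n = len(y)
    c = covariates.shape[1]
    ones = np.ones((n, 1))
    # Group vectors for groups 1..K-1; the last group is identified
    # by not having membership in any of them.
    dummies = np.column_stack(
        [(group == g).astype(float) for g in range(1, n_groups)]
    )
    r2_full = r_squared(np.hstack([ones, covariates, dummies]), y)
    r2_restricted = r_squared(np.hstack([ones, covariates]), y)
    df1 = n_groups - 1       # K - 1
    df2 = n - c - n_groups   # N - C - K
    f = ((r2_full - r2_restricted) / df1) / ((1.0 - r2_full) / df2)
    return f, df1, df2, r2_full, r2_restricted


# Hypothetical data, generated only for illustration.
rng = np.random.default_rng(0)
group = np.repeat([1, 2, 3], 10)
covariates = rng.normal(size=(30, 2))
y = (2.0 + covariates @ np.array([1.0, 0.5])
     + 0.8 * (group == 1) + rng.normal(size=30))

f, df1, df2, r2_full, r2_restricted = ancova_f(y, covariates, group, n_groups=3)
print(f"F({df1}, {df2}) = {f:.3f}")
```

With N = 30, C = 2, and K = 3, the degrees of freedom are 2 and 25, matching the K-1 and N-C-K rules given in the text.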


Actually, almost any existing analysis of variance 
design can be conceptualized and completed by using 
the general purpose multiple regression program. 

If the regression approach had nothing else to offer, 
the ease of using the computer as an adjunct to prob- 
lem solving would be worthwhile in itself. 


However, a most important point needs to be con- 
sidered with experimental design and, especially, 
formal courses in that subject. A potential student 
might rightly assume that one expected behavioral
outcome of his having taken a course in experimental
design is that he can design an experiment.
If comments such as Addleman's (1) are a true reflection
of the actual status of that course, the student
instead learns about several well-known existing
designs but does not actually reach the stage of
flexibility to design his own research.


While it has by no means been demonstrated here, 
once the researcher becomes familiar with the use 
of the regression program, he can take at least one 
step toward his own experimental design. For example,
suppose a researcher were interested, for
research purposes, in a curvilinear interaction. If
he were to follow the routine of searching existing
programs for such a situation, he may well change
the course of his experimentation so that he could
use a more familiar design. If, on the other hand,
the researcher was familiar with both the linear
models and the general purpose multiple regression
program, he could pursue the question of interest.
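
One way such a question might be posed, sketched here under stated assumptions, is to ask whether the quadratic trend of the criterion on a predictor differs between two groups. The full model adds a group-by-squared-predictor vector to a restricted model that omits it. The data, the variable names, and the choice of a quadratic term are all hypothetical illustrations, not a procedure taken from the text.

```python
import numpy as np


def r_squared(X, y):
    """R^2 from a least-squares fit of y on X (X includes a unit column)."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    return 1.0 - float(resid @ resid) / float(((y - y.mean()) ** 2).sum())


# Hypothetical data: two groups of 30, one continuous predictor x.
rng = np.random.default_rng(1)
n = 60
x = rng.uniform(-2.0, 2.0, size=n)
g = np.repeat([0.0, 1.0], n // 2)   # group membership vector
y = (1.0 + 0.5 * x + 0.3 * x**2 + 0.7 * g * x**2
     + rng.normal(scale=0.5, size=n))

ones = np.ones(n)
# Restricted model: linear and quadratic trend, group, linear interaction.
restricted = np.column_stack([ones, x, x**2, g, g * x])
# Full model: adds the curvilinear (group by x squared) interaction vector.
full = np.column_stack([restricted, g * x**2])

r2_full = r_squared(full, y)
r2_restricted = r_squared(restricted, y)
df1 = full.shape[1] - restricted.shape[1]   # vectors dropped
df2 = n - full.shape[1]                     # N minus parameters in full model
f = ((r2_full - r2_restricted) / df1) / ((1.0 - r2_full) / df2)
print(f"F({df1}, {df2}) = {f:.3f}")
```

The same R² comparison used for the covariance design carries over unchanged; only the vectors entered in the two models differ.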


It should again be emphasized that the content of 
this presentation has assumed only two things: the 
reader has a background in experimental design and
has access to a general purpose multiple regression 
program likely to be found at any modern computer 
installation. As indicated earlier, other approaches 
to problem solving by multiple regression also use 
other subroutines. It would be useful, however, for 
the researcher to try the multiple regression pro- 
gram already available to him. The versatility of 
the program can soon become apparent. 


Two articles of related interest would be useful 
in pursuing the multiple regression approach. Co- 
hen (3) gives some insight into the difficulty re- 
searchers have had realizing the use of multiple re- 
gression for problem solving. Ward (10) discusses 
in some detail, four different approaches that re- 
searchers with different statistical backgrounds would 
use as they relate to experimental design. He also 
argues effectively for the usage of multiple regression
as a method of problem solving.


Assuming the reader has attempted the use of the 
general purpose multiple regression program, he 
might ask, "What is next?" The two articles mentioned
in the previous paragraph are a good starting
point in the use of the regression approach to problem
solving. Depending on the reader's interest and
background, several other suggestions might be made.
It would be worthwhile to consider coursework designed
to help research workers formulate models.
Besides the sources already given, another excellent
text is one by Mendenhall (8). While it would
be fair to say that the points of view of the various
authors in the sources listed are not identical, they
retain the same flavor of using regression and linear
models for the solution of research questions.


SUMMARY


The intention of this presentation has been to in- 
troduce the educational researcher who typically has 
a background in experimental design to the regression
approach to problem solving. With the increasing
availability of computers, the researcher can
expect to have much more flexibility
in the design of his own research.


With the availability of several stock programs,
the researcher may not be aware of the versatility
of the multiple regression program. Some of the
more common types of designs have been examined
from a regression approach. This has been done so
that the researcher might become aware that the
approach will include the designs he is familiar with,
but from a different viewpoint. The most important
contribution occurs after the researcher becomes familiar
with the multiple regression program. Then he can
truthfully design his own research rather than
fit some existing design.


This presentation has concerned itself only with
the general purpose multiple regression program,
which is likely to be available at any modern computer
installation. Becoming familiar with this program
might be considered a first step toward more flexibility
for the individual researcher in his experimental
design.


EXPERIENCE, SKILL, EXPRESSED FEAR, AND
EMOTIONAL REACTION TO MOTOR SKILLS
PERFORMED UNDER CONDITIONS OF HEIGHT


ABSTRACT

This study determined the extent to which Ss, grouped
on the basis of excellent, average or poor performance
on two motor tasks 4 feet above the floor, could be
differentiated by their previous pleasant and unpleasant
experiences with height, expressed fear of height,
emotional response to height, risk-taking activity level,
running skill, and cross-stepping skill. Six trials of
running and cross-stepping on a balance beam, three
of these tests 4 feet above the floor, and a self-report
inventory were completed by 139 college women. A
multiple discriminant analysis indicated that the
criterion groups were differentiated on both tasks by
their motor skill and their current risk-taking activity
level, but not by their past experiences or expressed
fear of heights.


It is generally assumed that pleasant past experiences
and [...] are two factors that may positively influence
an individual's willingness to attempt new skills. It
is thought that when students come to the [...] experience
with a history of failures resulting in unpleasant
experiences, the student may be fearful and quite
hesitant to participate in the activity. [...] true in
physical [...] classes where [...] the first attempt in
order [...]. This is particularly true in those [...]
at a height. Some students seem to be preoccupied with
the possibility that [...] springboard diving and
tram[...] injurious and [...] to note that a student's
co[...] quite satisfactory [...] on the floor [...]
rapidly when the skill is attempted above [...].

The detrimental effect that fear or anxiety has
on motor [...] is a very real phenomenon and has
been identified by Wherry (4,5) as anticipatory
physical threat stress (APTS). Moderate levels
of APTS apparently enhance performance, while
unusually high levels result in disrupting behavior.
Complex responses probably deteriorate under APTS
more than simple responses. When Ss are provided 
with an opportunity to avoid physical threat by skilled 
motor performance, their performance is enhanced (2). 


The levels of APTS generated are described as a
function of an individual's past experience, which in
turn patterns his perception of the probability of
physical threat, the proximity of physical threat, and the
unpleasantness of the event. A factor that apparently
generates substantial levels of APTS in some students
is the requirement that a motor task be performed
at a substantial height. The fact that there
is considerable variation in students' emotional
response to a potentially dangerous task evokes an
interesting question. Why are some students terrified
at the thought of moving their bodies through space
high above the floor, and other students see it only
as an interesting task? What combination of variables
might enable a teacher to predict performance
that will degenerate under APTS?


It is tempting to hypothesize that an individual
who has had pleasurable childhood experiences with
height, such as playing in swings, jungle gyms, and
[...], perceives height as adding to the enjoyment
of the motor task. It is also tempting
to suggest that if in addition to these pleasant
experiences an individual visualizes himself as skilled in
motor activities, he is likely to anticipate the event
of falling to be of low probability. Another possibility
is that a person who has been accustomed to playing
at heights and perhaps falling occasionally may
not perceive the event of falling as having the same
degree of unpleasantness as that person who was
protected and who currently rarely places himself in
unstable positions above the floor.


It would be quite beneficial to know whether 
knowledge of skill level and past experiences with 
heights could enable the physical educator to predict 
initial and tentative early performances that will de- 
teriorate when performed at a height. If past experi- 
ence and skill level were predictive, injuries could 
be avoided. In addition, this concept would certain- 
ly be considered by the constructors of elementary 
physical education curricula. 


It would also be helpful to know whether knowledge
of a student's self-reported emotional or physiological
reaction to height as well as his current
self-initiated risk-taking activity level would provide
the educator with information that would predict early
successes or failures. Although many physiological
responses have been categorized as indicators of
emotional arousal (3), it is not known whether these
responses are predictors of behavior under APTS.


Human behavior can no longer be explained in
terms of single variables as well as it can in terms
of multiple variables interpreted in relationship to
each other. The specific contribution of experience,
skill, and emotional arousal to motor performance
under height conditions is not the question of import,
but rather, how the variables of past pleasant and
unpleasant experiences, motor skill in the task,
emotional response to the height, and current risk-taking
activities interact to determine individuals' initial
performances of a motor skill high above the floor.
Finally, does the complexity of the motor task
differentially affect the use of these descriptive variables in
predicting performance under task-height conditions?


PROBLEM 


This study determined the extent and manner in
which Ss, grouped on the basis of excellent, average
or poor on two dynamic gross motor tasks performed
4 feet above the floor, may be differentiated by the
following set of descriptive variables operating
together: pleasant experiences with height, previous
unpleasant experiences with height, expressed fear
of height, emotional response to height, risk-taking
activity level, running skill, and cross-stepping skill.


PROCEDURES


Two basic types of data were obtained: the criterion
scores that reflected Ss' ability to perform a
motor task 4 feet above the floor, and the descriptive
scores that placed the S on a continuum for each of
seven variables.


Criterion Scores 


College freshman and sophomore women (N = 139)
were the Ss of this investigation, and completed three
trials at each of two gross motor tasks. Both tasks
involved traveling the length of a 4-foot high balance
beam as quickly as possible. Subjects began by standing
on the end of the beam in such a way that they
depressed a microswitch. When the S began the task
by lifting her foot from the microswitch, a chronoscope
was activated and continued timing until a microswitch
at the opposite end of the beam was depressed
by the S. The two tasks were (a) running the
length of the beam, and (b) traversing the beam with
a cross-step sideward run. The within-day reliability
coefficients were .94 and .86, respectively. Excellent,
average, and poor groups were delineated on the basis
of performance on each of these tasks.


Descriptive Variables 


Pleasurable and unpleasant experiences that Ss
had had with heights, their expressed fear of high
places, and their current level of participation in
risk-taking activities were estimated by sub-scores
on a self-report inventory. Emotional responses to
height were also obtained by a self-report inventory
which was administered immediately after Ss had
traversed the beam for the first time at its regular
height of 4 feet. In responding to the emotional
response inventory, Ss placed themselves on a
scale for each of fifteen physiological autonomic
nervous system responses commonly reported as
accompanying fear or stress anticipation.


Motor skill was defined as the performance level
of the S on the running and cross-step tasks without
the height factor. The beam was placed flat on the
floor and Ss were timed throughout three trials of
both tasks. Within-day reliability coefficients
were .90 and .87, respectively. Scoring and timing
procedures were identical to those used throughout the
criterion tasks.


Testing Schedule


Subjects first became familiar with the skill tasks and
laboratory equipment by completing, individually
under experimental conditions, three trials on the
running task and the cross-step task. These data were
not utilized, since the trials were undertaken primarily
to familiarize the S and to stabilize the drastic
reduction of time between trials one and two that
occurs as a result of learning (6,7).


One week l[...] obtained by [...] in the [...]
inventory [...] beam at its full height. Upon descending,
Ss completed the emotional response inventory. [...]
scores were obtained [...] running task and three
trials [...]. All Ss were individually tested by the
investigators, and prior to this last testing
session, Ss had no knowledge that they would be asked
to perform these skill tasks at a height of four feet.


ANALYSIS


The Ss were described, on the basis of the data
obtained during the 6-week period, in terms of
their position on a continuum for each of seven
variables. These were postulated a priori as variables that
might singly or in combinations describe their motor
performance 4 feet above the floor. The high running
and high cross-step scores, serving as the criterion
scores reflecting Ss' ability to perform under the
stress of height, were used to divide the Ss into three
criterion groups: excellent (Q1), average (Q2,3), and
poor (Q4). A multiple discriminant analysis was used
to determine the extent and manner in
which the descriptive variables might also describe
the category [...] under the stress of height.


The discriminant analysis used in this study may
be considered as an extension of a single-classification
analysis of variance which includes simultaneously
several variables in order to determine whether
significant group differences exist. The technique
produces discriminant functions that include the original
variables used to discriminate the groups; discriminant
weights, which indicate the relative contribution of each
variable to the discriminant function; a Wilks' lambda
and a probability level for the null hypothesis
of equal group centroids; and a percent of
variance value, indicating the percent of total
discriminating power of the test battery contained in each
discriminant function.
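
The quantities listed above can be illustrated with a small sketch. It assumes the textbook formulation in which the discriminant axes come from the eigenvalues of W⁻¹B (W and B being the within-group and between-group sums-of-squares-and-cross-products matrices), with Wilks' lambda equal to the product of 1/(1 + eigenvalue). The data are randomly generated stand-ins with the same group sizes and number of variables as the high running task; the function name `discriminant_summary` is hypothetical, and this is not the program used in the study.

```python
import numpy as np


def discriminant_summary(X, groups):
    """Return Wilks' lambda, percent of trace per axis, and the W and B matrices."""
    labels = np.unique(groups)
    p = X.shape[1]
    grand_mean = X.mean(axis=0)
    W = np.zeros((p, p))   # within-groups SSCP
    B = np.zeros((p, p))   # between-groups SSCP
    for label in labels:
        Xg = X[groups == label]
        mean_g = Xg.mean(axis=0)
        centered = Xg - mean_g
        W += centered.T @ centered
        diff = (mean_g - grand_mean).reshape(-1, 1)
        B += len(Xg) * (diff @ diff.T)
    # Eigenvalues of W^-1 B; at most min(p, groups - 1) are nonzero.
    eigvals = np.linalg.eigvals(np.linalg.solve(W, B)).real
    n_axes = min(p, len(labels) - 1)
    eigvals = np.clip(np.sort(eigvals)[::-1][:n_axes], 0.0, None)
    wilks = float(np.prod(1.0 / (1.0 + eigvals)))
    percent = 100.0 * eigvals / eigvals.sum()
    return wilks, percent, W, B


# Hypothetical scores: 139 subjects, 7 descriptive variables, 3 groups.
rng = np.random.default_rng(2)
groups = np.repeat([0, 1, 2], [34, 71, 34])
X = rng.normal(size=(139, 7))
X[groups == 0] += 0.8         # shift one group on every variable
X[groups == 2, :3] -= 0.5     # shift another group on three variables

wilks, percent, W, B = discriminant_summary(X, groups)
print(f"Wilks' lambda = {wilks:.3f}")
print("percent of trace per axis:", np.round(percent, 2))
```

With three groups there are two discriminant axes, and their percent-of-trace values sum to 100, which is the sense of the "Total trace = 100" entries in Table 1.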


RESULTS 


High Running Task


As may be seen in Table 1, the excellent, average,
and poor groups categorized on the high running
task were differentiated by a function which includes
their emotional responses to height, their current
risk-taking activity level, their running skill,
and their cross-step skill. These descriptive variables,
in terms of their ability to categorize the three
criterion groups, combined to be highly significant and
accounted for almost 92 percent of the [...]. They
were, therefore, substantially [...] of Ss'
performance at a height than the variables of past
experiences (either pleasant or unpleasant) and the
Ss' expressed fear of high places. Although pleasant
experiences with heights and expressed fear of high
places combined to form a discriminant function,
it was not a significant function.


The criterion group means were distributed in
a consistent pattern on those four descriptive
variables which combined to discriminate among the
three criterion groups (Table 2). The excellent
performers at heights reported significantly fewer
emotional responses, such as tachycardia, cold palms,
perspiration, and nausea, than the average group,
while they in turn reported fewer than the poor group.
The excellent group also indicated that they participated
in more risk-taking activities, such as water skiing,
riding [...], and ski jumping, than the other groups.
The excellent group was also more [...] on the low
running task than the average group, while the average
group was superior to the poor group. The groups did
not appear to have different past experiences with
heights, as is indicated by the al[...] group
means in Table 2. This finding

TABLE 1

DISCRIMINANT AXES AND COMPONENTS FOR
GROUPS CATEGORIZED ON THE HIGH RUNNING
AND HIGH CROSS-STEP TASK

Components                          High Running     High Cross-stepping
                                    Discriminant     Discriminant
                                    Weights (a)      Weights (b)

Discriminant Axis I
  Emotional Response to Height          .63              .35
  Risk-taking Activity Level           -.49             -.46
  Running Skill                         .84              .68
  Cross-step Skill                      .60              .91

Discriminant Axis II
  Pleasant Experience with
    High Places                         .76              .50
  Expressed Fear                        .36              .59

(a) Total trace = 100. Λ = .463, F = 7.39, df = 14, 260, P < .001
    Axis I  = 94.90% of variance. χ² = 83.71, df = 8, P < .001
    Axis II =  5.10% of variance. χ² = 6.10, df = 6, P = .410 NS

(b) Total trace = 100. Λ = .465, F = 8.67, df = 14, 260, P < .001
    Axis I  = 98.33% of variance. χ² = 100.14, df = 8, P < .001
    Axis II =  1.67% of variance. χ² = 2.51, df = 6, P = .870 NS



may be illustrated by the fact that, of those Ss who
reported they had broken a bone as a result of a fall,
sixteen were in the excellent group, eight were in the
average group, and eleven were in the poor group.
There apparently was no pattern of differentiation in
unfortunate or unpleasant experiences in the childhood
of the Ss among the three groups. The expressed
fear of height was essentially the same among the
three groups also.

TABLE 2

MEANS, UNIVARIATE F RATIOS, AND PROBABILITY LEVELS
FOR GROUPS CLASSIFIED BY PERFORMANCE
ON HIGH RUNNING TASK

Descriptive variables          Excellent   Average    Poor      F(2,136)
                               (N = 34)    (N = 71)   (N = 34)  Ratio

Emotional Response to Height    10.90       12.10      12.70      4.61
Unpleasant Experience            4.91        4.80       4.66       .12
Pleasant Experience              3.85        3.81       3.68      2.15
Expressed Fear                   2.30        2.47       2.44      1.22
Risk-taking Activity Level      10.85        8.96       8.19      8.45
Running Skill                    1.25        1.47       1.67     22.12
Cross-stepping Skill             2.82        4.85       5.64     53.01

Group Multivariate Means
  Axis I                         2.30        2.77       3.26
  Axis II                        6.06        6.32       6.06

TABLE 3

MEANS, UNIVARIATE F RATIOS, AND PROBABILITY LEVELS
FOR GROUPS CLASSIFIED BY PERFORMANCE
ON HIGH CROSS-STEP TASK

Descriptive variables          Excellent   Average    Poor      F
                               (N = 35)    (N = 67)   (N = 37)  Ratio

Emotional Response to Height    10.39       11.91      13.63     15.23
Unpleasant Experience            4.82        4.81       4.72       .14
Pleasant Experience              3.78        3.85       3.67      2.58
Expressed Fear                   2.28        2.46       2.48      1.57
Risk-taking Activity Level      10.11        8.95       8.13      8.82
Low Running Skill                1.23        1.45       1.73     33.90
Low Grapevine Cross-stepping
  Skill                          3.49        4.43       5.31     13.80

Group Multivariate Means
  Axis I                         2.30        2.77       3.26
  Axis II                        6.06        6.32       6.06


When Ss were categorized into excellent, average,
and poor performers on the basis of their
scores on the more complex high cross-step task,
again only one discriminant function was significant
(see Table 1). The significant group of variables
accounted for 98 percent of the variance, and included
only the Ss' current risk-taking activity level and
their running and cross-step skill. Their past
experience with heights and their expressed fear of
height combined to account for only 2 percent of the
total variance. Significant F values, displayed in
Table 3, indicate group differences consistent with
these findings. Although emotional response to height
did not load high enough on the discriminant function
to be considered a part of that cluster, the group
means were significantly different.


DISCUSSION 


Subjects who were grouped on the basis of their
gross motor performance 4 feet above the
floor were differentiated by the descriptive variables,
current risk-taking activity level and motor skill in
the tasks, operating together. In the simpler running
task, emotional responses also differentiated the
groups. These results are compatible with Wherry's
model that predicts complex skill to be more affected
by high levels of APTS. In this study, the more complex
task of cross-stepping at heights was best
differentiated by the Ss' skill at cross-stepping rather
than by emotional responses. The discriminant
weights of [...] that are shown in Table 1 indicate that
its contribution to the discriminating power of the
variables may play the most predominant role [...]
complex [...]. In other words, emotional [...]
and life style may be significant in predicting
group performance under stress, but as the task
becomes more complex, Ss' actual skill in the task
assumes a more important role.

The assumption that providing individuals with
pleasant [...] experiences in their childhood
will result in [...] and successful performance
in physical education risk-taking activities
appears to be unsupported. There is, however, support
for the concept that persons who have moderate
levels of skill in an area will perform in unfamiliar
stressful situations requiring that particular skill
[...] than an individual who is unskillful. Individuals
who are unskillful are apt to generate unusually high
levels of [...], which in turn cause performance to
deteriorate more.

The findings of this study indicate that the educator
can best predict a student's behavior under the
stress of height by knowing the student’s skill and ac- 
tivity interests, rather than by knowing his background 
or by attending to his fears regarding the task. It fur- 
ther suggests that it is probably not the occurrence of 
isolated experiences, such as building tree houses and 
swinging on tree ropes into the swimming hole, that 
are important in determining the student behavior. 
Rather, it is what the child who participates in these 
activities becomes as a result of them. It substanti- 
ates the fact that every individual should have the op- 
portunity to develop motor skills to such an extent 
that he perceives himself to be capable of any physical
demands which may be placed upon him, whether
these demands occur within the framework of a stressful
situation or not.


ABSTRACT


An analysis of the formal educational process utilizin, 
to identify and define the major input components which acc 


CURRENT TRENDS toward increased commercial
production of educational materials make
rational models for use in the analysis and thorough
examination of the educational process extremely
timely. Although research in education began some
73 years ago with the early survey studies of Rice
(8), there has been a meager amount of consistent
educational information and knowledge accumulated
about the major input components and
how these input components interact to affect pupil
learning.


To a large extent, the failure of educational research
to contribute large consistent bodies of knowledge
about the educational process has been due to faulty
experimental design (4). Another important factor
contributing to the lack of consistent educational
information has been the failure to realistically
conceptualize a usable model which identifies and
concisely defines the major input and output factors of
the educational process which must be considered
(either controlled or systematically varied) in the
design of educational experiments. Thus, the following
analysis of the educational process and a
conceptualization of a basic model for use in the design of
educational experiments was undertaken.


MAJOR INPUT ELEMENTS OF THE FORMAL
EDUCATIONAL PROCESS


One of the factors which has directly contributed
to the limited informational output of educational
experiments is the failure to consider all of the major
input elements of the educational process. This is
in part attributable to the failure of previously
proposed theories to separate and specifically define
some of the more pertinent factors which affect
various types of learning. Six major components in the
educational process uniquely and in combination have
been shown to affect learning. These input components
are: curriculum (C), instruction (I), teacher(s)
and/or implementor(s) (T), learner(s) (L),
media (M), and environment (E).


Traditionally, the tendency has been to combine
or singly define in global terms curriculum and
instruction. Macdonald (7) has suggested the
impossibility of defending such a position, however, from
a systems point of view. Both the curriculum
component and the instruction component make unique
and independent contributions in the input process,
resulting in different types of learning. Even so, few
research studies have been designed which separate
these two factors. The results of two experiments
in mathematics education (1,2) which did separate
the curriculum and instruction factors support the
Macdonald (7) contention.


The curriculum component is uniquely defined as the
content or subject matter information which is organized
into communicable form to be conveyed to the learner
in the form of an educational program. The curriculum,
then, has such general characteristics as scope,
structure, organization, and sequence. Scope refers to the
exact content included in the program. Structure refers
to the elements about which the content is organized
(e.g., areas of discipline, general topics, broad
generalizations, or problems). Organization refers to the
characteristics of ordering the structured content (e.g.,
spiraling: the revisiting of topics, areas, or ideas at
increasing levels of sophistication at successive
intervals throughout the program). Sequence refers to
the order of the content in the program.


The instruction category of the model uniquely
includes all different methods or approaches of
implementing the curriculum. The discovery method
and the expository method are examples of two of
the more predominantly used and supported
instructional approaches. Another set of strongly supported
instructional techniques are the behavioral
modification strategies and methodologies. Although
research on instructional methods has been vast, the
majority of the studies done have confounded the
instruction factor with the curriculum factor (1). The
same curriculum, thus, can be implemented utilizing
different methods of instruction.


The teacher and/or implementor is yet another input
component in the educational process model. The
teacher (implementor) category of the model uniquely
includes all of the different implementors of the
curriculum. The teacher-implementor, being human, takes
on all of the human characteristics of personality,
intelligence, aptitude, attitude, etc. The implementor,
however, in our more modern age of technology,
may be a machine. This may range from a slide
projector to the more complicated computer-linked
apparatus. In many cases research has used a machine
as the implementor and attempted to generalize to
teacher (human) implementors. There are obvious
differences which should be taken into account in
design of and generalization from research.


The learner category of the model uniquely includes
all receptors involved in the formal educational
process. The learners are characterized by all factors
known about the human to be critical to the learning
process. Therefore, possible learner variables
would be intelligence, personality, aptitude, sex,
general physical characteristics, and attitudes.


A fifth major input component, environment, has 
been suggested by the work of the School Environment 
Research Project (6). This dimension would include 
such variables as classroom social climate, physi- 
cal climate (e.g., spatial arrangement of furniture, 
boards, apparatus, temperature of the room, humid- 
ity, lighting, extraneous noise levels, etc.), and size 
of the group interacting in the learning process. Seldom
has this dimension been given enough credence
in the design of educational experiments in classroom
situations. Too often, one treatment has been imple- 
mented in one classroom and another in a different 
classroom. Differences in learning could either be 
attributed to differences in environmental factors be- 
tween the two classrooms or between the two treat- 
ments. Little, if any, information about the educa- 
tional process is gained from such poorly 
conceptualized experimentation. 


The sixth and last major input component, media, 
has been suggested by the continual increase in the 
number of different types of instructional materials 
which are currently available for use in the class- 
room. The elements which would be included in this 
category would be such materials as textbooks, trade
books, films, slides, filmstrips, manipulative aids
(e.g., blocks, counters, science equipment), games,
and charts. Media is not to be confused with
curriculum. Curriculum is the content as it is organized,
delimited with respect to scope, sequenced,
and structured within the media (e.g., textbook,
filmstrip, film, etc.).


MAJOR OUTPUT ELEMENT OF THE FORMAL 
EDUCATIONAL PROCESS 


The major output element of the formal educational
process is learning. The general criterion of the
success of the combinational input variables is
assumed, here, to be learning on the part of the learners
or subjects involved in the study. This criterion
is to be considered in its broadest sense and thus
includes all of the possible types (e.g., cognitive,
affective, and psychomotor) and levels of learning
(3).


INTERACTION OF THE MAJOR INPUT ELEMENTS


In considering this model for its general use in
the design of educational experiments, all of the pos-
sible input element combinations must be conceptu-
alized (see Table 1 and Figure 1). The curriculum-
learner (CL) input combination, for example, would
include experiments designed to determine the cur-
riculum characteristics which for certain types of
learners would maximize a desired kind and level
of learning. The instruction-learner (IL) input com-
bination would include experiments designed to de-
termine the methods or approaches which for certain
types of learners would maximize a desired kind
and level of learning. The curriculum-media (CM)
input combination would include research designed
to determine the types and kinds of media (e.g., film,
textbook, filmstrip, slides, etc.) which for various
types of content would maximize a desired kind and
level of learning.


The input combination (CILTME) would include 
the most definitive and information-producing ex-
periments portrayed by the model. Certainly, as
more information about individual components be- 
comes available, research can more frequently be 
done on these higher order overlap areas. 


This general model can also be applied to re-
search in different subject matter areas (see Figure
2). After the research is completed there is new
information about each of the input components which 
can be used to improve and revise hypotheses about 
how the variables within each one of the input areas 
together and singly maximize different types and lev- 
els of learning. 


APPLICATION OF THE MODEL TO THE DESIGN 
OF EXPERIMENTS 


One of the first steps in designing any education-
al experiment, using this model, is to identify the
area which will be studied (see Table 1 and Figure 1).
The objectives of the study can then be stated in the
form of questions to be answered, hypotheses to be
tested, or effects to be estimated. Some of the stan-
dard designs which have been proposed for use in the
design of educational experiments are both experi-
mental (5) and quasi-experimental (4). In utilizing
any of these designs, however, the major factors
outlined in this model should be taken into account
so that confounding is minimized and meaning-
ful information about the educational process can be
derived from the ensuing research.


TABLE 1


MAJOR INPUT COMPONENTS AND
COMBINATIONAL INTERACTIONS


Major Components:


    Curriculum (C)
    Instruction (I)
    Teacher (T)
    Learner (L)
    Media (M)
    Environment (E)


Logical Combinations:


    Two-Way   Three-Way   Four-Way   Five-Way   Six-Way

    CI        CIL         CILT       CILTM      CILTME
    CL        CIT         CILM       CILTE
    CT        CIM         CILE       ILTME
    CM        CIE         CITM       CLTME
    CE        CLT         CITE       CILME
    IL        CLM         CLTM       CITME
    IT        CLE         CLTE
    IM        CTM         CTME
    IE        CTE         ILTM
    LT        CME         ILTE
    LM        ILT         ILME
    LE        ILM         ITME
    TM        ILE         LTME
    TE        ITM         CIME
    ME        ITE         CLME
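

The combinations listed in Table 1 can be enumerated mechanically. The following is a minimal sketch, not part of the original model; the function name and the letter ordering "CILTME" are assumptions chosen so that the generated labels match those in the table. A full enumeration yields 20 three-way combinations, five more than the 15 printed in Table 1.

```python
from itertools import combinations

# Letter order chosen (as an assumption) to match the labels used in Table 1:
# Curriculum, Instruction, Learner, Teacher, Media, Environment.
COMPONENTS = "CILTME"


def input_combinations(components=COMPONENTS):
    """Return {order: [labels]} for every combination of two or more components."""
    return {
        order: ["".join(combo) for combo in combinations(components, order)]
        for order in range(2, len(components) + 1)
    }


combos = input_combinations()
for order, labels in combos.items():
    print(order, len(labels), " ".join(labels))
```

Each label names one area of the model in which an experiment could be placed; the single six-way label, CILTME, is the highest-order area.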


Another critical factor in the application of this
general model is the generalizability and potential
power of selected component variables. Generaliz-
ability, here, refers to the potential use of the informa-
tion derived in future educational and experimental
problem situations. Power refers to the relative im-
portance of the single variable in accounting for or
affecting learning. If a variable accounts for only a
small percentage of the learning (e.g., angle of the
child to projection screen; taking a pretest over the
subject matter to be taught), then certainly other more
potent variables (e.g., sequential and/or hierarchi-
cal structuring of content; reinforcement structures
and/or paradigms) should be the focus of the educa-
tor's and/or experimenter's attention.


IDENTIFICATION AND INSTITUTION OF NEEDED 
CONTROLS 


Once the area of research has been isolated, the
next step is to identify the component areas which are
to be controlled. By using the model, identification
becomes simple. If the CIL area is to be studied, then
the external control areas are M, E, and T. The in-
ternal control areas are C, L, and I. There are four
ways to control variation: (1) the general layout
of the experiment, (2) the design of the experiment,
(3) the statistical analysis of the experimental data,
and (4) the development of the experimental program.
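

The identification step described above amounts to splitting the six components into those under study and the remainder. A small sketch follows; the function name `control_areas` is hypothetical and is introduced only for illustration.

```python
COMPONENTS = ("C", "I", "T", "L", "M", "E")


def control_areas(studied):
    """Split the six components into internal (studied) and external control areas."""
    studied = set(studied)
    unknown = studied - set(COMPONENTS)
    if unknown:
        raise ValueError(f"unknown component(s): {sorted(unknown)}")
    internal = [c for c in COMPONENTS if c in studied]
    external = [c for c in COMPONENTS if c not in studied]
    return internal, external


internal, external = control_areas("CIL")
print("internal:", internal, "external:", external)
```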


Consider this example. Suppose the variables of
concern were learner cognitive development, curric-
ulum sequence, and instructional approach, thus plac-
ing the experiment in the CIL area of the model. Since
the generalizability potential of the variables chosen
for study is critical, systematic and logical analyses
should be done regarding each of the overlap or inter-
action hypotheses of the variables chosen. Suppose,
for example, the variables chosen in the CIL area
were: curriculum sequence (psychological sequence
paradigm versus subject matter sequence paradigm);
instructional approach (discovery versus exposito-
ry); and learner cognitive development (sensori-
motor stage, stage of concrete operations, period of
formal operations). In all cases of overlap or inter-
action, reasonable alternative hypotheses can be stat-
ed and supported. Consequently, the potential infor-
mational output of such an experiment is maximized.
Furthermore, the general variables selected have
maximal generalizability potential to the structuring
of future learning environments.


Since cognitive development cannot be randomly as-
signed to learners, it must be considered a blocking
variable. Learners consequently must be grouped
according to their manifest cognitive development lev-
el, and then subjects within the three levels randomly
assigned to the four treatment combinations. This
yields a standard 3 x 2 x 2 randomized block design with
several replicates per cell.


In order to internally control for other learner var-
iables, subjects within each block would be, as the
randomized block design (5) specifies, randomly as-
signed to each of the four treatment conditions (i.e.,
(1) subject matter sequence-discovery approach, (2)
subject matter sequence-expository approach, (3)
psychological sequence-discovery approach, (4) psy-
chological sequence-expository approach). A sketch
of the design is shown in Figure 3.
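

The within-block random assignment just described can be sketched as follows. This is an illustrative sketch only: the roster, the learner identifiers, the fixed seed, and the function name are assumptions, and the round-robin dealing after a shuffle is one simple way to obtain equal cell sizes, not a procedure specified in the text.

```python
import random
from itertools import product

STAGES = ("sensorimotor", "concrete operations", "formal operations")
SEQUENCES = ("subject matter", "psychological")
APPROACHES = ("discovery", "expository")
TREATMENTS = list(product(SEQUENCES, APPROACHES))  # the four treatment conditions


def assign_within_blocks(learners_by_stage, seed=0):
    """Randomly assign the learners in each block to the four treatments.

    learners_by_stage maps a cognitive-development stage (the blocking
    variable) to a list of learner identifiers. Learners are shuffled
    within their block and dealt out to the treatments in turn, so cell
    sizes within a block differ by at most one.
    """
    rng = random.Random(seed)
    assignment = {}
    for stage, learners in learners_by_stage.items():
        shuffled = list(learners)
        rng.shuffle(shuffled)
        for position, learner in enumerate(shuffled):
            assignment[learner] = (stage,) + TREATMENTS[position % len(TREATMENTS)]
    return assignment


# Hypothetical roster: eight learners per stage, i.e. two replicates per cell.
roster = {stage: [f"{stage}-{n}" for n in range(8)] for stage in STAGES}
design = assign_within_blocks(roster)
```

With eight learners in each of the three blocks, every one of the twelve cells of the 3 x 2 x 2 layout receives two learners.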


To control or minimize the effects of other instruc-
tion variables, the instructional programs must be
designed so that the only difference between the two
implementation procedures is the mode of presenting
the curricula. Thus, either teachers must be trained
to implement the curriculum using exactly the same
techniques in all instances except for the mode of
presentation, or a machine of some type must be pro-
grammed such that the only differences are in mode
of presentation. Generalization, naturally, is quite
different in the two cases. Teachers as implemen-
tors are quite susceptible to error in experimental
situations. Even so, the added flexibility that they
bring to the teaching situation can be brought to bear
on instances a priori unforeseeable when programming
a machine to implement the program. Certainly, due
consideration should be given to the advantages and
disadvantages of these two alternatives.


The external components, those major components 
in the model not being specifically studied, T, M, and 
E, must also be considered. The effects of the T
component can best be controlled or minimized by 
random assignment of (human) teachers to all treat- 
ment conditions if the number of teachers is large, 
or random assignment and systematic rotation of (hu- 
man) teachers to the various treatment conditions if 
the number of teachers is small. Another alternative 
is to use a nonhuman implementor or machine. If a 
machine is used, the same type of machine must be 
used across all treatment conditions. 


The M component can be controlled by equalizing 
the use of various types of media across the four 
treatment conditions. If manipulatives are used in 
one treatment condition, then they must be used at 
precisely the same time and for the same purpose in 
each of the other treatment conditions. To ensure this
type of control, a large amount of pre-experimental 
program planning must take place either via pro- 
gramming for machines or training of teachers. 


The E component is much more difficult to control, 
but is causing increased concern on the part of edu- 
cators. The effects of physical characteristics of the 
environment can usually be minimized by equalizing 
the classroom or instructional environment across 
treatment conditions.


If experiments are to be carried out in regular 
classrooms, the manipulation of environmental con- 
ditions for research purposes may be difficult. One 
solution, again, is the random assignment of exper- 
imental treatment conditions to classrooms and then 
systematic rotation of subjects within treatment con- 
ditions to all of the various classroom environments. 
Another possibility is the utilization of an experimen- 
tal classroom on wheels. In this way, all treatments 
across schools and across classrooms would be in the 
same environment for all treatment conditions. In 
other words, the environment would remain constant 
across treatment conditions.


The effects of the social environment are not so
easily minimized. Reduction to a large extent can be
handled by randomly assigning subjects to treatment
groups. One might assume, then, that social inter-
action patterns of various types would have an equally
likely and mutually exclusive chance of occurrence
in each of the various treatment groups.


The example given here is only one of innumerable
studies which could be conducted using this general
research model. Hopefully, as more information
about the educational process is gained through re-
search of this type, research in the higher order in-
teraction areas can be completed.


To maximize the learning potential of the learner 
involved in the formal educational process, the ma- 
jor input element components must be interacting in 
an ideal fashion at a particular point in time. Ulti- 
mately, there may come a time when the educator 
knows how to amalgamate media, curriculum, in- 
struction, environment, teacher(s) and/or imple- 
mentor(s), and learners to maximize a desired kind 
and level of learning. The information needed can
only come through systematic and controlled re-
search of the formal educational process.


ABSTRACT


A scoring procedure for multiple choice exams which allows for partial knowledge and also allows the ex- 
aminer to control the expected gain due to guessing is considered in this paper. The procedure is considered 
from an elementary game theory approach. Comparisons are given with other scoring methods.


THE LITERATURE for mental test theory is
a vast one, and interest and research are accelerating.
Procedures for scoring examinations, particularly
multiple-choice tests, have received considerable at-
tention in these research efforts. Among several
others these include papers by Chernoff (3), Horst
(6), Guilford (5), Calandra (2), and Arnold (1).
In recent literature (see DeFinetti (4)), the per-
sonal probability approach has been considered for
evaluating the knowledge of an examinee. However,
the use of personal probabilities is a controversial
area of scientific inference. Two basic problems to
be considered in scoring procedures for multiple-
choice tests are: (1) How does one handle the guess-
ing factor? (2) How much of the available informa-
tion has been recovered?


In this paper we consider a scoring procedure for
multiple-choice exams which allows for partial knowl-
edge and also allows the examiner to control the
expected gain due to guessing. The pro-
cedure considered here is presented from an
elementary game theory approach. The proposed
method has been successfully used in actual class-
room examinations in more than one subject matter.
The results of one such classroom examination, using
the proposed scoring procedure, are compared with
several well known methods of scoring multiple-choice
tests. Possible explanations for differences between
the various methods are also given.


PROBLEM 


We will formulate the problem in terms of a sin-
gle item. The composite score for an examination
consisting of several items will usually be obtained by add-
ing the scores for each individual item. Suppose that
a question (item) has k possible choices, one best
(correct) choice and (k-1) distractors. The instruc-
tions to the examinee will be to give the smallest sub-
set of choices that he is confident contains the
correct response; i.e., the examinee is simply in-
structed to mark out as many incorrect responses as
he can for each item. He should also be informed (for
the purpose of discouraging outright guessing) that he
receives a penalty if the correct or "best" response
is marked out as an incorrect answer. This proce-
dure then allows the examinee to receive credit for
partial knowledge, the credit being greater the more
incorrect responses he can eliminate. It will
readily be seen that such a procedure is simple to
score. Simplicity in scoring is usually considered a
desirable property when multiple-choice exams are
utilized.


The primary concern here is to determine the ap-
propriate scores that should be assigned to the var-
ious possible outcomes which may be given by the ex-
aminee, and also to correct for the guessing factor,
which is always an important consideration in multi-
ple-choice tests. Arbitrarily assigning scores to the
various outcomes would be undesirable. We will il-
lustrate our approach to this problem by way of ex-
ample. Suppose that an item has four possible choic-
es, one of which is correct and three of which are
distractors. Suppose also that the examinee is in-
structed to delete as many incorrect answers as pos-
sible for each item. Looking at the problem as an
elementary game, let us consider the possible "strat-
egies" available to the examinee. Suppose that the
credits (scores) to be given to the examinee for each
possible outcome are as follows:


    $C_0$, if examinee eliminates no distractors,

    $C_1$, if examinee eliminates one distractor,

    $C_2$, if examinee eliminates two distractors,

    $C_3$, if examinee eliminates all three distractors,

    $-P$, if examinee eliminates the correct response
    as a distractor.


The reasonable strategies available to the examinee 
will depend upon his true state of knowledge relative 
to the number of distractors that he is able to elim- 
inate. For example, if the examinee’s true state of 
knowledge is such that he is unable to eliminate any 
of the distractors, then he has the choices of (a) sim-
ply not guessing, (b) attempting to eliminate one of
the three distractors at random for which he has three 
chances out of four of being successful, (c) attempt- 
ing to eliminate two of the three distractors at ran- 
dom for which his chances are two out of four of being 
successful, or (d) attempting to eliminate all three 
distractors at random for which his chances of be- 
ing successful are reduced to one out of four. 


The available strategies for each possible state
of knowledge for an item with one correct answer and
three distractors are listed at I, II, III, and IV, below.


I. If the state of knowledge is such that no dis-
tractors can be eliminated, the possible strategies
are:

    (i) do not guess and take a sure credit of $C_0$.

    (ii) eliminate one choice at random for a pos-
    sible credit of $C_1$ with probability 3/4 or a
    penalty of $-P$ with probability 1/4.

    (iii) eliminate two choices at random for a pos-
    sible credit of $C_2$ with probability 1/2 or a pen-
    alty of $-P$ with probability 1/2.

    (iv) eliminate three choices at random for a
    possible credit of $C_3$ with probability 1/4 or a
    penalty of $-P$ with probability 3/4.


II. If the state of knowledge is such that one dis-
tractor can be eliminated, the possible strategies are:

    (i) do not guess and take a sure credit of $C_1$.

    (ii) eliminate one choice at random for a pos-
    sible credit of $C_2$ with probability 2/3 or a pen-
    alty of $-P$ with probability 1/3.

    (iii) eliminate two choices at random for a pos-
    sible credit of $C_3$ with probability 1/3 or a pen-
    alty of $-P$ with probability 2/3.


III. If the state of knowledge is such that two dis-
tractors can be eliminated, the possible strategies are:

    (i) do not guess and take a sure credit of $C_2$.

    (ii) eliminate one choice at random for a pos-
    sible credit of $C_3$ with probability 1/2 or a
    penalty of $-P$ with probability 1/2.


IV. If the state of knowledge is such that all three
distractors can be eliminated, then full credit of $C_3$
is received with probability 1.


The possible strategies for items with other than four
choices could be enumerated in a similar manner. For
the above example of four choices, we will now cal-
culate the correct scores $C_0$, $C_1$, $C_2$, and $C_3$ so that
the game of guessing is a "fair game"; i.e., so that
if the examinee wishes to guess and gamble for more
credit at the risk of a penalty, then his expected gain
due to guessing is 0. Procedures which may be used
to penalize the guesser so that his expected gain due
to guessing is negative (i.e., it does not pay to guess)
will also be discussed.


We first note that expected gain due to guessing for
a given strategy is:


    EG = Expected credit when the game of
         guessing is played - Expected credit
         when the game of guessing is not played,


where the expected credit due to guessing is given by


    [credit if win game] P(win game) -
    [penalty if lose game] P(lose game).


Hence, for a four-choice item, if the examinee
cannot eliminate any of the distractors and we require
a 0 expected gain due to guessing, then for the four
strategies given at (I) the expected credit is, respec-
tively


    (i)   $(C_0)(1) - (P)(0) = C_0$,

    (ii)  $(C_1)(3/4) - (P)(1/4) = \frac{1}{4}(3C_1 - P)$,

    (iii) $(C_2)(1/2) - (P)(1/2) = \frac{1}{2}(C_2 - P)$,

    (iv)  $(C_3)(1/4) - (P)(3/4) = \frac{1}{4}(C_3 - 3P)$,


where $-P$ is the penalty for incorrectly marking the
correct answer as a distractor. In order that the gain
due to guessing be 0, we require that the expected
credit for each of the above four cases be equal to $C_0$,
which is the score given to the examinee who cannot
eliminate any of the distractors and who decides not
to guess. But for a 0 gain due to guessing we must
therefore require $C_0 = 0$, since $C_0$ is the score given
when the examinee cannot eliminate any of the three
distractors. Hence, the "fair" score for four choic-
es to an item is


    $C_0 = 0$, $C_1 = P/3$, $C_2 = P$, and $C_3 = 3P$,


where P is the penalty for eliminating the correct an-
swer as a distractor.


For the purpose of illustration, let us suppose that
three points are to be assigned to $C_3$ (i.e., the credit
received by an examinee who is able to eliminate all
three distractors); then P should be 1 (a penalty
of -1), and the credits assigned to $C_0$, $C_1$, and $C_2$
should be 0, 1/3, and 1, respectively. Using these scores we
indicate below the appropriate credit to be given to an
examinee for each of the fifteen possible outcomes for
an item with four choices. An "x" beside a choice
indicates that the examinee has eliminated that choice
as the correct answer. We suppose in the following
example that the choice "c" is the correct answer to
the item.


    Outcome    a.   b.   c.   d.    Credit

       1       x    x         x     3 points
       2       x    x               1 point
       3       x              x     1 point
       4            x         x     1 point
       5       x                    1/3 point
       6            x               1/3 point
       7                      x     1/3 point
       8                            0 points
       9                 x          -1 point
      10       x         x          -1 point
      11            x    x          -1 point
      12                 x    x     -1 point
      13       x    x    x          -1 point
      14            x    x    x     -1 point
      15       x         x    x     -1 point


From the above illustration it is easily seen that even
though the number of possible outcomes is large,
the examiner can determine the correct credit to be
given very quickly by simply checking to see if the
correct answer has been eliminated, and if not, by
counting the number of distractors correctly elimi-
nated.
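

The two-step check just described (was the correct answer eliminated, and if not, how many distractors were) can be sketched as a small scoring function. This is an illustrative sketch for the four-choice example with three points for full credit; the function and variable names are assumptions and do not appear in the paper.

```python
from fractions import Fraction

# Credits for the four-choice example with P = 1 (full credit of 3 points),
# indexed by the number of distractors correctly eliminated.
CREDITS = {0: Fraction(0), 1: Fraction(1, 3), 2: Fraction(1), 3: Fraction(3)}
PENALTY = Fraction(1)


def score_item(eliminated, correct):
    """Return the credit for one item given the set of choices marked out."""
    eliminated = set(eliminated)
    if correct in eliminated:
        return -PENALTY  # the correct answer was marked out
    return CREDITS[len(eliminated)]


for marked in ("abd", "ab", "a", "", "c", "acd"):
    print(repr(marked), score_item(marked, correct="c"))
```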


We will now show that these scores are also the
"fair" scores that should be given for each of the
other possible states of knowledge considered above.
For the three strategies given at (II), where the ex-
aminee is able to eliminate one distractor, the ex-
pected credits are, respectively


    (i)   $(C_1)(1) - (P)(0) = C_1$,

    (ii)  $(C_2)(2/3) - (P)(1/3) = P/3 = C_1$,

    (iii) $(C_3)(1/3) - (P)(2/3) = P/3 = C_1$.


Hence, the expected gain due to guessing used in
strategies (ii) and (iii) is 0 (i.e., the expected cred-
it due to guessing is the same as that for correctly
eliminating one distractor). Similarly, if the state of
knowledge is such that the examinee can eliminate two
distractors (case III), then the expected credits for
the two possible strategies are


    (i)   $(C_2)(1) - (P)(0) = C_2$,

    (ii)  $(C_3)(1/2) - (P)(1/2) = P = C_2$.


Of course, if the state of knowledge is such that the
examinee can eliminate all three distractors, he re-
ceives the maximum credit of $C_3 = 3P$.
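

The expected credits for cases I through III can be checked numerically. The sketch below is not from the paper; it assumes P = 1 and that guessed choices are struck uniformly at random from those the examinee cannot rule out, and the function name is hypothetical.

```python
from fractions import Fraction
from math import comb

P = Fraction(1)
# Fair credits for a four-choice item: C_0 = 0, C_1 = P/3, C_2 = P, C_3 = 3P.
CREDIT = {0: P * 0, 1: P / 3, 2: P, 3: 3 * P}


def expected_credit(known, guessed, k=4):
    """Expected credit when `known` distractors are eliminated with certainty
    and `guessed` further choices are struck at random from the rest."""
    remaining = k - known  # choices the examinee cannot rule out
    # Probability that none of the randomly struck choices is the correct one.
    p_win = Fraction(comb(remaining - 1, guessed), comb(remaining, guessed))
    return CREDIT[known + guessed] * p_win - P * (1 - p_win)


for known in range(4):
    for guessed in range(4 - known):
        gain = expected_credit(known, guessed) - CREDIT[known]
        print(f"known={known} guessed={guessed} expected gain={gain}")
```

Every line of output reports an expected gain of 0, which is the sense in which the scores are "fair" from any state of knowledge.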


We will now consider the proper credit for the
more general case. Suppose that the item of interest
has k possible responses, (k-1) of them distractors
and one correct or "best" response. If the examinee
is unable to eliminate any of the distrac-
tors, then he has the choice of randomly deleting d
(d = 0, 1, 2, ..., k-1) of them for a possible credit
of $C_d$, and at the same time realizing the risk of re-
ceiving a penalty of $(-P)$ if the best answer is elim-
inated as a distractor. The examinee's expected
credit for eliminating d responses at random is giv-
en by


    $$(C_d)\left(\frac{k-d}{k}\right) - (P)\left(\frac{d}{k}\right), \qquad (1)$$

where


    $C_d$ = credit if d distractors are correctly elim-
    inated,

    $-P$ = penalty for eliminating the correct answer
    as a distractor,

    k = total number of possible choices for item.


Equating the expected credit due to randomly guess-
ing at (1) to 0, since 0 is the credit we would give
to an examinee who is not able to eliminate any dis-
tractor and who also refuses to guess, and solving
for $C_d$, we get


    $$C_d = (P)\left(\frac{d}{k-d}\right). \qquad (2)$$

The credit $C_d$ at (2) is the "fair" credit, in the
sense of 0 expected gain due to guessing, for a mul-
tiple-choice question with k possible choices, (k-1)
of them distractors, and $-P$ the penalty for labeling
the correct answer as a distractor.
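

The step from (1) to (2) can be spelled out as follows. The counting argument for the probabilities is a reconstruction under the assumption that the d deleted responses are drawn uniformly at random from the k choices; it is not quoted from the text.

```latex
\begin{align*}
\Pr(\text{correct answer survives})
  &= \binom{k-1}{d} \Big/ \binom{k}{d} = \frac{k-d}{k},
  \qquad
  \Pr(\text{correct answer deleted}) = \frac{d}{k}, \\[4pt]
C_d\,\frac{k-d}{k} - P\,\frac{d}{k} &= 0 \\
\Longrightarrow\quad C_d\,(k-d) &= P\,d \\
\Longrightarrow\quad C_d &= \frac{P\,d}{k-d}.
\end{align*}
```

For k = 4 this gives $C_1 = P/3$, $C_2 = P$, and $C_3 = 3P$, in agreement with the four-choice scores derived earlier.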


If we consider the special case where partial cred-
it is not allowed, then selecting only the "best" an-
swer is equivalent to eliminating all (k-1) distrac-
tors. For this case (d = k-1), the credit is
$C_{k-1} = (P)(k-1)$. Standardizing so that $C_{k-1} = 1$, we have
$P = 1/(k-1)$, and the above scoring procedure simply
reduces to the often used scoring rule of $S = R - [W/(k-1)]$,
where R = number of correct answers, W =
number of incorrect answers, and k = number of
choices. This scoring rule is widely used in exams
where the instructions are to encourage guessing only
if at least one choice can correctly be eliminated.
Hence, it is seen that this method is a special case
of the procedure considered here where partial
knowledge is allowed.
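

The reduction to the conventional rule can be checked with a short sketch. The function names and the sample responses are hypothetical; the sketch assumes every item is answered by retaining a single choice and that full credit is standardized to 1, so that the penalty is 1/(k-1).

```python
from fractions import Fraction


def conventional_score(right, wrong, k):
    """Conventional correction-for-guessing score S = R - W/(k-1)."""
    return right - Fraction(wrong, k - 1)


def all_or_nothing_score(responses, k):
    """Partial-knowledge score when every item is answered by keeping one choice.

    `responses` is an iterable of booleans: True if the retained choice is the
    correct one (all k-1 distractors eliminated, credit 1), False if the
    correct answer was eliminated (penalty 1/(k-1)).
    """
    penalty = Fraction(1, k - 1)
    return sum(Fraction(1) if kept_correct else -penalty for kept_correct in responses)


# Hypothetical ten-item, four-choice test: seven right, three wrong.
responses = [True] * 7 + [False] * 3
print(all_or_nothing_score(responses, k=4), conventional_score(7, 3, k=4))
```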


A summary of the "fair" scores for items with
2, 3, 4, or 5 choices is given in Table 1 with the stan-
dardized score (based on unity for full credit) writ-
ten in parentheses.


We note in Table 1 that the case where the num-
ber of choices per item is two includes the true-
false type of examination. Also, for convenience the
scores given are standardized so that the maximum
credit for each case is unity. Hence, if the examin-
er wishes to give a maximum credit greater than
unity for correctly eliminating all of the distractors,
he simply multiplies each of the possible credits for
an item by the maximum credit given. For example,
if an item has three choices and the maximum cred-
it is to be 12 points, then $C_0$, $C_1$, $C_2$, and $-P$ would
become 0, 3, 12, and -6, respectively. Of course,
the maximum credit to be given to an item would
usually be the same as would be given for a correct
answer if the usual scoring method of simply choos-
ing only the correct answer was used.


TABLE 1


FAIR SCORES FOR VARIOUS STATES OF KNOWLEDGE


                         Number of Choices Per Item

                    2            3             4             5

    Fair Penalty    -P (-1)      -P (-1/2)     -P (-1/3)     -P (-1/4)
    C_0             0 (0)        0 (0)         0 (0)         0 (0)
    C_1             P (1)        P/2 (1/4)     P/3 (1/9)     P/4 (1/16)
    C_2                          2P (1)        P (1/3)       2P/3 (1/6)
    C_3                                        3P (1)        3P/2 (3/8)
    C_4                                                      4P (1)
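

The standardized entries of Table 1, and the rescaling to a larger maximum credit, can be reproduced from the general formula for the fair credit. The following is a minimal sketch; the function name `fair_scores` and its `max_credit` parameter are assumptions introduced for illustration.

```python
from fractions import Fraction


def fair_scores(k, max_credit=1):
    """Return (P, [C_0, ..., C_{k-1}]) scaled so that C_{k-1} equals max_credit.

    Uses C_d = P * d / (k - d); full credit C_{k-1} = P * (k - 1) fixes
    P = max_credit / (k - 1).
    """
    penalty = Fraction(max_credit, k - 1)
    credits = [penalty * Fraction(d, k - d) for d in range(k)]
    return penalty, credits


for k in (2, 3, 4, 5):
    penalty, credits = fair_scores(k)
    print(f"k={k}  -P={-penalty}  credits={[str(c) for c in credits]}")

# The three-choice item worth 12 points used as the example in the text.
penalty_12, credits_12 = fair_scores(3, max_credit=12)
```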


As stated, the scores given in Table 1 are the fair
scores in the sense of 0 expected gain due to guess-
ing. If the examiner prefers to discourage guessing,
then increasing the "fair" penalty for specific scor-
es causes the expected gain due to guessing to be neg-
ative; i.e., it does not pay the examinee to play the
game, and hence, he should not guess. For example,
in the case of an item with four choices, the standard
scores of 0, 1/9, 1/3, and 1 with a penalty of -1/3
constitute a fair game. When P > 1/3, the game of guessing
will yield a negative expected gain due to guessing,
the magnitude of which will depend upon how much
larger than 1/3 the penalty P is specified. Hence,
the examiner is able to control (or discourage) guess-
ing to any degree desired by simply increasing the
penalty for an incorrect answer. On the other hand,
guessing could be encouraged by making the penalty
smaller than the "fair" value.
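

The effect of moving the penalty away from its fair value can be illustrated numerically. This sketch is an assumption-laden illustration rather than the authors' procedure: the credits are held at their standardized fair values for a four-choice item while only the penalty is varied, and the function name is hypothetical.

```python
from fractions import Fraction


def expected_gain(known, guessed, penalty, k=4):
    """Expected gain from striking `guessed` extra choices at random when
    `known` distractors are already eliminated with certainty.

    Credits stay at the standardized fair values C_d = d / ((k-1)(k-d));
    only the penalty actually charged is varied.
    """
    fair_penalty = Fraction(1, k - 1)

    def credit(d):
        return fair_penalty * Fraction(d, k - d)

    remaining = k - known
    p_lose = Fraction(guessed, remaining)  # chance the correct answer is struck
    expected = credit(known + guessed) * (1 - p_lose) - penalty * p_lose
    return expected - credit(known)


for penalty in (Fraction(1, 4), Fraction(1, 3), Fraction(1, 2)):
    gains = [expected_gain(0, g, penalty) for g in (1, 2, 3)]
    print(f"penalty={penalty}: gains={[str(x) for x in gains]}")
```

Under these assumptions the gain is zero at the fair penalty of 1/3, negative for a larger penalty, and positive for a smaller one.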


COMPARISONS 


The above scoring procedure was used for a class-
room examination in elementary statistics. So that
comparisons could be made with other scoring meth-
ods, the instructions were modified as follows:


For each question on this exam there are 
four choices for your answer, only one of 
which is correct. You are asked to mark 
out as many of the incorrect answers as 
you can. The more incorrect answers that
you mark out the more credit you will re- 
ceive. However, there is a penalty if you 
should mark out the correct answer. The 
credits and penalty for each question are 
calculated so that you have a 0 expected 
gain due to guessing. For each question, 
you are also asked to circle the one an- 
swer which you feel is most likely correct. 
Of course, if you are able to mark out all 
but one choice, then circle the only remain- 
ing answer. 


The final instruction for circling the single choice
which is considered most likely correct was needed
in order that we could have a comparison with other
scoring methods. In our comparisons, we have
treated the choice circled by the examinee for each
question as the answer which would have been given
if the instructions had been the usual ones for a mul-
tiple-choice test; i.e., to select only the correct
answer for each question. The test consisted of nine
questions, each with one correct answer and three
distractors. The maximum credit was 9 points for
8 of the questions and 12 points for the other. From
Table 1, we see that the "fair" scores are $-P = -3$,
$C_0 = 0$, $C_1 = 1$, $C_2 = 3$, $C_3 = 9$ for the 9-point questions
and $-P = -4$, $C_0 = 0$, $C_1 = 4/3$, $C_2 = 4$, $C_3 = 12$ for the
12-point question. Twenty-five students took the ex-
am. Each question was attempted by every student.


The scoring methods used in our comparison with 
the partial knowledge procedure given above were: 


(A) Full credit for each correct answer circled, 
with no penalty for guessing. 


(B) Full credit for each question with correct an- 
swer circled but a penalty of -3 for the 9- 
point questions and -4 for the 12-point ques- 
tion for each item with an incorrect answer 
circled. Note that this is the conventional 
scoring formula given by 


    $$S = R - \frac{W}{k-1},$$

where R = number of correct responses, W =
number of incorrect responses, and k = num-
ber of choices per question.

(C) $S = a\left[\frac{kN_c - N}{N_c(k-1)}\right]$, where a = 9 or 12 for each
question, k = 4, $N_c$ = number of correct re-
sponses to item from entire class, and N de-
notes the number of examinees attempting the
question. We note that this score is the one
which minimizes the mean square error about
a true score of unity if the student actually
knows the correct answer and 0 if the student
does not know the correct answer (see 1).


(D) Same as (C) except that the class was strat-
ified on the basis of raw [...] rec-
ommendation (iii) of Arnold (1:11).


Each of the twenty-five students was then graded
by the procedure given in this paper which allows for
partial knowledge, and also by each of the methods
at (A), (B), (C), and (D). The students were
ranked from high to low for each scoring method. In
case of ties, average ranks were used. The ranks,
representing relative scores for each of the five
scoring methods given above, are summarized
for each student in Table 2.


TABLE 2


RELATIVE CLASS POSITION OF EXAMINEES FROM FIVE SCORING PROCEDURES


                               Scoring Method

    Examinee    Partial
    Number      Knowledge     (A)      (B)      (C)      (D)

        1          1.5         2        2        2        2
        2          1.5         2        2        2        2
        3          4           8        8        6.5      9.5
        4          4           5        5        9        6
        5          4           5        5        6.5      4.5
        6          6.5        10.5     10.5      4        7.5
        7          6.5        10.5     10.5     13        7.5
        8          8           2        2        2        2
        9          9           8        8       14       11
       10         10           5        5        6.5      4.5
       11         11           8        8        6.5      9.5
       12         13.5        15       15.5     12       16
       13         13.5        15       15.5     10       12.5
       14         13.5        15       15.5     11       14.5
       15         13.5        15       12       15.5     12.5
       16         16          18       18       15.5     18
       17         17          23       23       21.5     22.5
       18         18          12       13       17       17
       19         19          20       21       19       19.5
       20         20          15       15.5     18       14.5
       21         21          20       20       23.5     21
       22         22          20       19       20       19.5
       23         23          25       25       25       25
       24         24          23       23       23.5     24
       25         25          23       23       21.5     22.5


It is of interest to note that as a general rule, the
students who scored quite high with one scoring pro-
cedure also scored high with each of the others. Al-
so, the ones who scored quite low with one method
scored low with each of the other methods, too. How-
ever, the relative positions for the middle grades dif-
fer considerably from one scoring method to another.
We feel that there are at least two reasonable expla-
nations for this. Firstly, the random guessing fac-
tor causes the greatest variability in this range.
Secondly, since scores tend to cluster around the av-
erage, the distance between ranks in the middle is usu-
ally less than that at either extreme. This also
would cause greater variation for the ranks in the
"middle" scores.
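

The ranking convention used for Table 2 (high to low, with tied scores sharing the average of the ranks they occupy) can be sketched as follows. The function name and the sample scores are hypothetical; they are not the class data.

```python
def average_ranks(scores):
    """Rank scores from high to low, giving tied scores the mean of their ranks."""
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    ranks = [0.0] * len(scores)
    pos = 0
    while pos < len(order):
        # Extend `end` over the run of examinees tied with the one at `pos`.
        end = pos
        while end + 1 < len(order) and scores[order[end + 1]] == scores[order[pos]]:
            end += 1
        mean_rank = ((pos + 1) + (end + 1)) / 2  # ranks are 1-based
        for j in range(pos, end + 1):
            ranks[order[j]] = mean_rank
        pos = end + 1
    return ranks


# Hypothetical total scores for six examinees.
print(average_ranks([90, 90, 80, 70, 70, 70]))
```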


COMMENTS 


We have considered a scoring procedure for mul-
tiple-choice tests where partial knowledge is allowed.
The proper scores are well defined for any number
of choices for an item so that there will be a 0 ex-
pected gain due to guessing. If it is desired to dis-
courage guessing, one can simply enlarge the penalty
beyond the "fair" value to any magnitude desired. If
partial credit is not allowed, this procedure reduces
to the conventional score of $R - W/(k-1)$. The meth-
od has the practical advantage of being easily scored,
and the instructions are simple to understand. We
also feel that this method enables more information
to be extracted with multiple-choice exams. We have
found the method favorably received by students tak-
ing exams using this scoring procedure.


ABSTRACT


Subjects were high school instrumental music teachers. Grades
and a self-rating served as measures of the quality of the Ss'
undergraduate work. Thirty-one teaching effectiveness factors,
consisting of pupil performance factors, pupil knowledge of
music, band performance factors, and judgments by experts,
were selected as an indication of teaching success. Each
measure of undergraduate work was correlated to each teaching
effectiveness factor and relations were inferred from the
coefficients. It was found that quality of work in
undergraduate courses is not related to teaching effectiveness
as measured by pupil performance, pupil knowledge of music,
or band performance. Quality of work in some undergraduate
courses is related to experts' judgments of teaching
effectiveness.


THE PURPOSE of this study was to determine the effect of work
quality in undergraduate courses leading to a Music Education
degree on effectiveness in teaching instrumental music to
secondary school pupils.


The Ss were fifty-three West Virginia high school instrumental
music teachers who were graduates of the four largest music
schools in West Virginia. They had been teaching instrumental
music in their present positions at least 3 years.


MEASURES OF UNDERGRADUATE CURRICULAR
EXPERIENCES


Grades were used as a measure of the work quality by the S in
undergraduate courses. Grades in specific courses were
transcribed by direct examination of transcripts to sheets
which categorized the separate courses into the following
classifications: Music History, Music Theory, Major
Performance Instrument, Minor Performance Instrument, Music
Courses, Education Courses, Practice Teaching, Academic
Courses, and Composite Average of all courses.


A self-rating on the S's level of performing ability was
obtained and used as a measure of the quality of his
undergraduate work. He was asked to rate himself on his major
instrument performing ability, taking into consideration the
highest position attained in a large musical organization,
grades received on his major performance instrument, and
difficulty of works that he performed. The self-ratings may
have been biased by either vanity or humility; however, it can
be assumed that the Ss have an intimate knowledge of how well
they performed on their major performance instrument as
undergraduates, and that this will be reflected in the
self-ratings.


MEASURES OF TEACHING EFFECTIVENESS 


In this study teaching effectiveness is determined by a
description of learning in terms of specific performance
abilities and levels of music knowledge which are considered
indicative of the possession of that learning outcome.
Thirty-one factors were selected as an indication of teaching
effectiveness and [illegible] learning outcomes of pupils. The
teaching effectiveness factors were classified into the
following four general categories: Pupil Performance Factors,
Pupil Knowledge of Music History and Music Theory, Band
Performance Factors, and Judgments by Experts.
Pupil Performance Factors


The experimenter visited the school in which each S was
teaching and tape recorded eight pupils chosen by the S as
representative of his best performers on each of the following
instruments: flute, clarinet, saxophone, baritone, French
horn, trumpet, trombone, and sousaphone. Each pupil was asked
to play five musical examples that were specifically composed
by the experimenter to determine the performance level of
specific musical abilities that were later judged by three
faculty members of the Division of Music at West Virginia
University. Judges rated the quality of each pupil's
performance on a 1-10 scale on the following factors:
expressiveness of performance, rhythmic accuracy, fast
staccato playing, fast legato playing, sight-reading, tone
quality, and overall performance.


Pupil Knowledge of Music 


After the pupils had performed the examples for the
experimenter, they were given identical copies of a written
Music Theory and Music History test. The Music Theory test
comprised fifteen multiple-choice items. Correct answers
required that the pupil know key signatures, scales, triad
formation, meters, and rhythms. The Music History test was
divided into four sections. The first was concerned with the
chronology of composers, the second with the identification
of performers, the third with identification of major
compositions, and the fourth with identification of major
composers.


Band Performance Factors 


The Ss were high school instrumental conductors. 
Each of their bands performed at either a regional 
or area music festival within a 2-week period during 
the course of the study. The experimenter tape re- 
corded the S's band performance at one of the festi- 
vals. Three college instrumental music directors 
later judged the band performances on the following 
factors: tone quality, intonation, technique, balance, 
interpretation, and overall musical effect. Some of 
the several advantages in using the performance of 
these festival pieces as a measure of teaching effec- 
tiveness follow: 


1. Each band is adequately rehearsed and 
at peak performance level. 


2. Each band is sufficiently motivated by the 
possibility of receiving a superior rating. 


3. Each band performs an adequate variety of
styles from which aspects of performance can
be judged.


Judgments by Experts 


A composite rating was made of the overall teaching
effectiveness of each S based on judgments by the following
people who observed the S's quality of work: (1) the chairman
of the music department from which the S received his
undergraduate degree, (2) the superintendent in the county
where the S was then teaching, and (3) the experimenter. For
the sake of convenience, the judges are referred to as
experts. It is recognized that they are not all expert in each
of the areas under consideration. They were, however, in a
position to know the S's effectiveness as a teacher. They were
asked to consider the degree of success of the S's performing
groups and the abilities, attitudes, and knowledge which the
pupils had derived from his instruction. It is noteworthy that
there tends to be a high degree of agreement among the experts
in regard to their judgment ratings in this study. It appears
that common criteria are involved despite the difference of
viewpoint and background.


Treatment of Data 


There are nine measures of the S's quality of work in
undergraduate courses and thirty-one teaching effectiveness
factors. Each measure of undergraduate work was correlated to
each teaching effectiveness factor while holding constant the
S's years of teaching experience and the size of the school in
which he was teaching. The data were analyzed by an IBM 1620
computer. The program yielded partial coefficients of
correlation. In this study, when a coefficient is at or above
.265, so that its significance level is at or below .05, it is
assumed that a relationship exists between the variables. In
some cases in this study, when a number of correlation
coefficients are individually nonsignificant in one
classification but show a common trend, a binomial formula


$$\binom{n}{r}\left(\tfrac{1}{2}\right)^{r}\left(\tfrac{1}{2}\right)^{n-r}$$


was used to give the probability of this particular pattern of
coefficients. If this probability is less than .05, the trend
is adjudged as significant.
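As a rough illustration of the binomial check described above, the following sketch evaluates one binomial term. It assumes that n is the number of coefficients in a classification, that r is the number sharing the same sign, and that each sign is equally likely when no relationship exists; these readings, the function name, and the example counts are assumptions, not details given in the paper.

```python
from math import comb


def pattern_probability(n: int, r: int) -> float:
    """Binomial term C(n, r) * (1/2)**r * (1/2)**(n - r)."""
    if not 0 <= r <= n:
        raise ValueError("r must lie between 0 and n")
    return comb(n, r) * 0.5 ** r * 0.5 ** (n - r)


# Illustrative counts only: nine coefficients, all with the same sign.
p_all_same_sign = pattern_probability(9, 9)
trend_is_significant = p_all_same_sign < 0.05
```

With these illustrative counts the probability is (1/2)^9, well under the .05 threshold. Whether the authors summed the tail of the distribution or used a single term is not stated, so the single term is shown here.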


The relationship of the Ss’ quality of undergrad- 
uate course work on their effectiveness in public 
school music teaching is inferred from the partial 
correlations and the levels of significance relating 
to those coefficients. 
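A partial correlation with two variables held constant can be sketched with the standard recursive formula. This is not the IBM 1620 program used in the study; the function names and the small data set below are invented purely for illustration.

```python
from math import sqrt


def pearson_r(x, y):
    """Ordinary product-moment correlation between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def _partial(r_xy, r_xz, r_yz):
    """Remove one control variable from a correlation."""
    return (r_xy - r_xz * r_yz) / sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


def partial_r_first(x, y, z):
    """Correlation of x and y with z held constant."""
    return _partial(pearson_r(x, y), pearson_r(x, z), pearson_r(y, z))


def partial_r_second(x, y, z, w):
    """Correlation of x and y with both z and w held constant."""
    return _partial(
        partial_r_first(x, y, z),
        partial_r_first(x, w, z),
        partial_r_first(y, w, z),
    )


# Invented values: a grade average, a judged rating, years of teaching
# experience, and school size for eight hypothetical teachers.
grades = [2.8, 3.4, 3.1, 3.9, 2.5, 3.6, 3.0, 3.3]
ratings = [6.1, 7.4, 5.9, 8.2, 5.5, 7.0, 6.6, 7.3]
years = [3, 5, 8, 12, 4, 20, 7, 10]
size = [400, 1200, 800, 300, 950, 600, 1500, 700]
r_partial = partial_r_second(grades, ratings, years, size)
```

The order in which the two control variables are removed does not change the result, which is a convenient sanity check on any implementation.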


RESULTS 


The results appear under the following headings 
representing the measures of teaching effectiveness 
used in this study. 


Pupil Performance Factors 


The coefficients of correlation (found in Tables 1 and 2)
between the following measures of undergraduate courses and
pupil performance factors are not appreciably higher than
zero: music theory, major performance instrument, minor
performance instruments, education courses, practice teaching,
academic courses, music courses, composite average, and major
performance instrument self-rating.


All of the correlations between music history grades and pupil
performance factors resulted in negative coefficients. This
particular trend is part of a pattern of negative coefficients
when music history grades are used as a measure of
undergraduate course work. The probability of this particular
pattern of coefficients occurring is less than .01; therefore,
the trend is adjudged as significant. Likewise, the trends of
negative coefficients of correlation between education courses
and pupil performance factors, and between a composite average
of undergraduate grades and pupil performance factors, though
individually nonsignificant, are part of a pattern adjudged as
significant.


Pupil Knowledge of Music 


The coefficients of correlation (found in Table 3) between the
measures of undergraduate course work and pupil knowledge of
music as measured by scores on music history and music theory
tests are not appreciably higher than zero.
TABLE 1


PARTIAL CORRELATIONS BETWEEN UNDERGRADUATE GRADES AND PUPIL PERFORMANCE FACTORS 


ON ALL WIND INSTRUMENTS 


(Column headings illegible in the source scan.)
Music History Courses -.326 -.248 -.304 -.182 -.216 -.216 -.249 -.286 -.263
Music Theory 
Courses . 067 141 .061 1101 2079 ‚167 .096 . 104 - 099 
Major Performance | 
Instrument - 096 -066 — -.039 .019 .020 .037 013 019 008 x 
Minor Performance 
Instrument -.017 0520 -.012 .045 040 .001 032 .021 024 
Education Courses -.224 -.091 -.260 -.097 -.052 -.208 -.140 -. 152 -.149 
Practice Teaching .069 .062 .049 „119 116 .079 . 096 .077 079 
Academic Courses -.047 -01 -.075 020 . 048 .014 028 -.026 -.016 
Music Courses - 046 -005 — -.127 .013 . 000 -051 — -.037  -.027  -.081 | 
Composite Average -.115 ^ ..057 ^ ..1gg -.008 -.015 -.016 -.061 -.069  -.064 
Major Performance In- 
strument Self- 
Rating .183 -014 ‚113 049 - 103 . 107 149 . 096 .100 
The negative coefficients of correlation between music history
grades and pupil music history or music theory scores, and
between the composite average of undergraduate grades and
pupil music history or music theory scores, though
individually nonsignificant, are part of the pattern of
negative coefficients adjudged as significant.


Band Performance Factors


The coefficients of correlation (found in Table 4) between
undergraduate grades and band performance factors are not
appreciably higher than zero. Six of eight coefficients that
resulted from the correlations between the Ss' self-ratings on
their major performance instruments and band performance
factors are significant at the .05 level. The negative
coefficients of correlation between music history grades and
band performance factors, though individually nonsignificant,
are part of the pattern of negative coefficients adjudged as
significant.


Judgments by Experts


The largest coefficients of correlation (found in Table 5) in
this study occurred between the following measures of
undergraduate course work and judgments of the Ss' teaching
effectiveness by experts: music theory grades, grades in music
courses, and a composite average of undergraduate grades.


The coefficients of correlation between the following measures
of undergraduate course work and judgments of the Ss' teaching
effectiveness by experts are not appreciably higher than zero:
music history grades, major performance instrument grades, and
grades in education courses. It is noteworthy that the
coefficient of correlation between music history grades and
judgments by experts is not negative and thus deviates from
the pattern of coefficients that develops when music history
grades are used as a measure of undergraduate course work.


Though individually nonsignificant at the .05 level, the
coefficients of correlation between the following measures of
undergraduate course work and judgments by experts are
appreciably higher than zero: minor performance instrument
grades, practice teaching grades, grades in academic courses,
and a major performance instrument self-rating.


The coefficients of correlation between each measure of
undergraduate course work and each classification of teaching
effectiveness factors suggest that the criteria used by
experts to judge the teaching effectiveness of the Ss are not
based solely on the learning outcomes of the pupils. Though
the experts apparently take these outcomes into consideration,
factors are operating on the teaching situation which
influence their judgments, such as classroom discipline,
personal attractiveness of the teacher, alertness of the
pupils in the classroom, pupil morale, and possibly the
economic environment. All of these foregoing factors and many
others may be considered by the experts in the light of their
own educational viewpoints.
TABLE 3


PARTIAL CORRELATIONS BETWEEN UNDERGRADUATE GRADES AND PUPIL KNOWLEDGE OF MUSIC


                                   Pupil Music      Pupil Music
                                   History Scores   Theory Scores
Music History Courses              -.016            -.158
Music Theory Courses               -.099            -.181
Major Performance Instrument       -.129             .059
Minor Performance Instrument       -.071             .016
Education Courses                   .062            -.101
Practice Teaching                  -.045             .119
Academic Courses                    .072             .097
Music Courses                      -.192            -.014
Composite Average                  -.069            -.029
Major Performance Instrument
  Self-Rating                      [illegible]      [illegible]


TABLE 5


PARTIAL CORRELATIONS BETWEEN UNDERGRADUATE GRADES AND JUDGMENTS BY EXPERTS


                                   Judgments by
                                   Experts
Music History Courses               .180
Music Theory Courses                .385
Major Performance Instrument        .145
Minor Performance Instrument        .248
Education Courses                   .097
Practice Teaching                   .251
Academic Courses                    .231
Music Courses                       .345
Composite Average                   .381
Major Performance Instrument
  Self-Rating                      [illegible]


CONCLUSIONS


Quality of work in music history courses is not 


TABLE 4 


PARTIAL CORRELATIONS BETWEEN UNDERGRADUATE GRADES AND BAND PERFORMANCE FACTORS 


(Column headings illegible in the source scan.)
Music History Courses -.153 -.214 -.121 -.134 -.191 -.155 -.162 -.160
Music Theory Courses 156 .133 — 1157 . 206 . 198 194 183 -168 
Major Performance Instrument . 019 .027 .033 .013 126 022 047 .029 
Minor Performance Instrument . 020 . 055 .027 . 074 086 066 057 .058 
Education Courses 7.010 -.006 .041 .033 .020 010 008 . 006 
Practice Teaching .062 -.047  -.038 -.007 029 -.008 004 -.011 
Academic Courses . 037 =, 012 . 056 .035 062 049 041 . 030 
Music Courses .128 -100 1115 „235 156 136 134 .128 
Composite Average - 090 -060 111 - 088 129 106 . 102 . 096 

Major Performance Instrument Self-Rating .334 .287 .248 .238 .304 .288 .294 .291
TABLE 6


PARTIAL CORRELATIONS BETWEEN UNDER- 
GRADUATE GRADES AND COMPOSITE SCORE 
OF PUPIL ACHIEVEMENT SCORES 


Composite 
Score 
Music History Courses -.201 
Music Theory Courses .002 
Major Performance Instrument -.006 
Minor Performance Instrument .020 
Education Courses -.058 
Practice Teaching .051 
Academic Courses .067 
Music Courses -.036 
Composite Average -.020 
Major Performance Instrument 
Self-Rating .121 


related to teaching effectiveness as measured by 
judgments by experts. It is inversely related to 
pupil performance factors, pupil knowledge of mu- 
sic theory and music history, and band performance 
factors. 


Quality of work in music theory courses, all music courses
taken as an average, and all undergraduate courses leading to
a Music Education degree taken as a composite average is
related to teaching effectiveness as measured by judgments by
experts. Quality of work in these courses is not related to
pupil performance factors, pupil knowledge of music theory or
music history, or band performance factors.


Quality of work in undergraduate courses, as 
measured by undergraduate grades and a major per- 
formance instrument self-rating, is not related to a 
composite score of pupil achievement scores, as ev- 
idenced by the size of the coefficients of correlation 
found in Table 6. 


Quality of work on the Ss’ major performance in- 
strument, on the minor performance instruments, 
in practice teaching, and in the academic non-mu- 
sic courses is not related to teaching effectiveness 
as measured by the teaching effectiveness factors 
of this study. 


Quality of work on the major performance instrument, as
measured by a self-rating, is related to teaching
effectiveness as measured by band performance factors. It is
not related to judgments by experts, pupil performance
factors, or pupil knowledge of music history or music theory.
In comparing the results of the correlations between the major
performance instrument self-rating and band performance
factors with those that occurred between major performance
instrument grades and band performance factors, it appears
that the level of proficiency on the major performance
instrument as viewed by the S may not be the same as the level
of proficiency as viewed by his teacher.
EFFECT OF DETAILED GUIDANCE ON THE


WRITING EFFICIENCY OF COLLEGE FRESHMEN 


PERRY R. CHILDERS and VIRGINIA J. HAAS 
The University of Wisconsin- Milwaukee 


ABSTRACT 


This study was designed to determine the effectiveness of
detailed correction and comment on specific preliminary [...].
Also included was a cross-[...]. The study involved the
research papers of forty-eight students enrolled in two
classes. One group received detailed guidance, the other did
not. An evaluation chart, [...] all papers; a systematic
sampling of 19 percent of these was graded by the second
rater. The inter-rater reliability coefficient (Pearson r) was
established as .96. The hypothesis (there will be no
significant difference in the grade achievement of separate
sections of second semester freshman English students on the
research paper according to different instructional
procedures) was not rejected. The conclusion was drawn that,
within the limited scope of this study, the described methods
were not validated by the classroom instructor as a means of
improving the performance of students on the research paper.


THE EFFICACY of the college freshman English course at the
University of Wisconsin-Milwaukee (UWM), and at almost any
university across the country, apparently (9), is consistently
questioned, often by the very people who teach the course. The
practices questioned range from the rationale for impromptus
to the reliability of the grading practices of individual
instructors.


This investigation was prompted by concern about the practice
of correcting and commenting on students' written work at
length, especially the major documented paper required by the
department in the second semester course. The question almost
inevitably formed by the instructor was whether or not
careful, precise marking of mechanical, structural, and formal
errors on outlines and rough drafts, as well as lengthy end
comment and individual conferences, would actually result in
an improved final draft of the research paper. Obviously,
careful, almost picayunish, attention to every possible error
the instructor can find, as well as one or two paragraphs of
thoughtful comment on ideas and presentation of those ideas,
involves many hours of work for the instructor who is handling
forty-five to fifty-five papers of perhaps eight to ten pages
each. Equally obvious is the fact that it requires several
hours of each student's time to produce the outline and rough
draft and then to carefully note and attempt to refine a paper
in terms of whatever comments seem valuable. In many cases,
the process can also involve conferences between the
instructor and student after submission of the outline and
again after the rough draft is submitted.


It should be noted that the National Council of Teachers of
English has pointed out that too little is known about
composition teaching itself and that the [illegible] research
on composition is so slight and so [illegible] that it serves
teachers of composition poorly (1). Somewhat surprisingly
then, this investigation discovered that extant research
offered a body of information which clarified some portions of
the question while suggesting conflict in other aspects of it.
For example,
Buxton's (2) study indicates that college freshmen
whose writing is thoroughly evaluated, criticized, and 
revised improve their writing more than their peers 
whose essays are not handled in this way. An ex- 
periment by Kincaid (10) showed significant variation 
in the quality of writing of freshmen ranked in the 
bottom quarter of his sample according to the theme 
assignment. And any number of studies, including 
those of Dye and Bledsoe (7), Cast (3), Fostvedt 
(8), Kincaid (10), Buxton (2), and Thompson (11), 
indicate that variations in grading practices signif- 
icantly affect the mark a student receives on a piece 
of work or a semester's production; this mark, of 
course, must serve as a measure of the student's 
achievement in written work. 


A review of studies which indicated that a multitude of
variables can influence grade achievement by freshman English
students, and observations in freshman English sections,
suggested that the following null hypothesis be tested:


There will be no significant difference in the 
grade achievement of separate sections of 
second semester freshman English students 
on the documented research paper following 
different instructional procedures. 


METHOD 


The sample consisted of forty-eight students en- 
rolled in second semester (English 102) English at 
UWM during the spring of 1969. They were enroll- 
ed in two sections. One section, Writing Group X, 
had twenty-five students; the other, Writing Group 
Y, twenty-three. 


Kincaid's study (10) offers the precedent by which university
registration procedures were relied on to insure that the
sample was representative of this segment of the university
population. Freshman registration is restricted to special
days in the fall and spring term because of the massive
numbers of freshman students. Each freshman registration form
is stamped with a sequential number as it is received. Then
the forms are collated according to preferred days and hours
for freshman English, with the first received being placed in
first section choices. Since the English department specifies
a maximum of twenty-seven students in 102 sections, no more
than this are assigned to sections beginning with the first
section number for the specific period listed. Balance in
number of students enrolled in a section is sought, but no
attempt is made to distribute students according to any other
standard. English 102 (which is required) includes those
students who received at least a D in English 101. Those who
receive an A in 101 are exempt from 102 but may take it if
they wish. The preceding factors indicated the use of a
systematic sampling (5) to obtain papers which were graded by
a second rater.


Groups X and Y were given a schedule of deadlines for the
research paper on the first day of class. All Ss were required
to present a tentative topic during the third week of class.
Students were asked to see the instructor for a conference or
to submit a written statement if they preferred. The
instructor and student either individually discussed narrowing
of subjects, initial references, likely areas of difficulty,
and so forth, or the instructor returned similar suggestions
in writing along with the S's declaration.


All Ss were asked to buy the English department pamphlet on
the research paper and read it, as well as two chapters on the
research paper in the freshman rhetoric text. One 50-minute
class period was devoted to an explanation of research paper
form and purpose and to the technicalities of documentation.
This was done on the same day for both groups.


Both groups handed in a short documented essay, 
the primary emphasis of which was the proper foot- 
note and bibliography form, the incorporation of source 
material into the text, and documentation practices 
which would preclude any suggestion of plagiarism. 
Ss in Group X were advised, in marginal and end comment,
of problem areas and were simply referred to
pertinent pages in the pamphlet and rhetoric text.
The papers of Ss in Group Y were given extensive 
marginal comment and suggestions, and those Ss with 
serious difficulties in coping with these procedures 
were called in for conferences. 


During the fourth week of classes, Ss in Group X 
were informed that they would be required to submit 
nothing more than their final draft on the due date. 
They were also asked to turn in their note cards with 
their papers. Group Y continued to follow the origi- 
nal schedule, which required that note cards from at 
least three sources be turned in during this week. 
These cards were checked by the instructor to see if 
material was quoted properly, if the students were us- 
ing fruitful sources, if the references pertained to 
the student's announced topic and thesis. Comments 
and questions were noted on the cards, and conferenc- 
es were held at the request of students. 


In the seventh week Group Y turned in sentence outlines of
their proposed papers. These were read for content,
organization, logic in the presentation of ideas, and for the
inclusion of extraneous material. No grades were given. At
this time the instructor requested conferences with specific
students, of whom all but two responded.


At the end of the eighth week Group Y submitted 
rough drafts of their papers. These copies were read, 
by the instructor only, for all of the criteria that were 
checked in the final paper (see Figure 1). At least
one paragraph, but sometimes two or more, of end
comment indicated student strengths and weaknesses
in rhetoric and form. Marginal comments were gen- 
erally directed to pointing out specific instances of 
mechanical, formal, or structural error, although 
rhetorical considerations were not excluded from 
them. Individual conferences for some students in 
Group X were recommended. Conferences were also 
held with those members of the group who requested 
them. 


During the twelfth week both groups submitted their
papers. At the deadline, forty-five were turned in;
three were submitted within the next three class periods.


In addition to the major documented paper, each group wrote
seven essays. The assignments were the same for both groups.
With the exception of the single theme already mentioned, none
of the others required documentation or bibliography, although
some students as a matter of choice wrote papers which
incorporated source material. The essays were required to be
the standard 500 words long. Students were given 1-10 days
between the time of assignment and the due date of the theme,
except for two impromptu essays.
FIGURE 1


EVALUATION CHART (4) 


RHETORIC 


Control and Coherence: careful, complete de- 
velopment of individual’s stated thesis; no ex- 
traneous material 


Logic: understanding of content and presenta- 
tion of it in an orderly way 


Diction: appropriate word choices; effective 
level of usage; apt title 


DOCUMENTATION 


Reference Material: fairly wide selection of 
appropriate sources; pertinent selection of 
quotations and material 


Incorporation: smooth integration of source 
material into student’s text 


Form: adherence to documentation require- 
ments, footnote, and bibliography form as 
outlined in the department pamphlet 


GRAMMAR AND STRUCTURE 


Sentences: use of syntactically accurate,
textured, appropriately varied ones


Paragraphs: use of orderly, unified, appro- 
priately developed generative structures 


Spelling, Punctuation, Transitions, Intro- 
duction, and Conclusion: use of techniques 
consistent with department standards and/ 
or intent and meaning of writer 


COMMENTS 


TOTAL ALL SECTIONS 


NUMERICAL GRADE 


LETTER GRADE 


NAME: 


Reliability in grading the research paper was sought according
to findings of prior experiments. Cast (3) and Buxton (2)
suggested, respectively, that a weighted scale is most
feasible for maintaining high reliability between raters, and
that raters who develop the evaluative criteria together are
likely to establish a high reliability rating. Additionally,
the experiments of Fostvedt (8) and Cast (3), as well as the
work of Diederich (6), indicate the criteria most often relied
on by raters of composition.
A second graduate teaching assistant, who came to
UWM at the same time as the classroom instructor, 
was asked to cooperate in establishing an evaluation 
chart. The final jointly-developed chart was based 
on criteria selected from prior research, the partic- 
ular nature of the assignment, and a standard depart- 
mental rating chart which is distributed to teaching 
assistants as a suggested guide. 


Each of the three major divisions on this chart could receive
a maximum of twelve points. Each subdivision was rated from
0-4, failing to superior. Both raters agreed that this system
would allow any single student to excel enough in one area to
achieve at least a passing grade on the paper and thus help to
diminish the effects of any bias in the instructor's ratings,
since the students could not be anonymous for the instructor
as they were for the other rater.
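The point scheme of the evaluation chart can be sketched as follows. The division and subdivision names come from Figure 1; the function name and data layout are hypothetical, and no conversion to a numerical or letter grade is attempted because the source does not describe one.

```python
# Three divisions, each with three subdivisions rated 0-4 (maximum 12 per division).
CHART = {
    "Rhetoric": ["Control and Coherence", "Logic", "Diction"],
    "Documentation": ["Reference Material", "Incorporation", "Form"],
    "Grammar and Structure": [
        "Sentences",
        "Paragraphs",
        "Spelling, Punctuation, Transitions, Introduction, and Conclusion",
    ],
}


def score_paper(ratings):
    """Total the 0-4 subdivision ratings for each division and overall."""
    totals = {}
    for division, subdivisions in CHART.items():
        division_total = 0
        for subdivision in subdivisions:
            value = ratings[division][subdivision]
            if not 0 <= value <= 4:
                raise ValueError(f"{subdivision}: rating must be between 0 and 4")
            division_total += value
        totals[division] = division_total
    totals["Total"] = sum(totals.values())
    return totals
```

Under this layout a paper rated 4 on every subdivision would total 36 points, 12 in each division.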


The research papers were submitted in no particular order.
(The three later papers were inserted into the proper stack by
an uninvolved individual.) A systematic sampling of every
fifth paper (approximately 19 percent of the total) was
abstracted. The groups were not mixed, and the sampling
consisted of four papers from one group and five from another.
The volunteer rater did not know to which group the papers
belonged. All the papers were then rated by the instructor.
The inter-rater reliability coefficient was established as
.96.
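The systematic sampling described above can be illustrated with a short sketch. It assumes that counting starts so that the fifth paper in each stack is the first one drawn, which the source does not state; the paper labels and the function name are invented.

```python
def every_kth(stack, k=5):
    """Return every k-th item of a stack, starting with the k-th."""
    return stack[k - 1 :: k]


# Hypothetical labels standing in for the two unmixed stacks of papers.
group_x = [f"X{i:02d}" for i in range(1, 26)]  # 25 papers
group_y = [f"Y{i:02d}" for i in range(1, 24)]  # 23 papers

sample = every_kth(group_x) + every_kth(group_y)
sampled_fraction = len(sample) / (len(group_x) + len(group_y))
```

Under this starting assumption the stacks of 25 and 23 papers yield five and four papers, nine of forty-eight in all, which agrees with the approximately 19 percent reported.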


RESULTS 


The null hypothesis was tested by means of the t-test, with
the .05 level of significance set for rejection of that
hypothesis.


The second analysis concerned differences which existed in the
achievement of Groups X and Y on the regular essays.


Finally, using the t-test again, the achievement of Group X on
the research paper and regular essays was compared; the same
procedure was repeated for Group Y.


The t-value established for Group X and Group Y on the
research paper was .7737 with 46 degrees of freedom, not
significant. The null hypothesis was not rejected.


Further, the t-value for Groups X and Y on their regular
essays was .9567, again not significant.


The within-group comparison yielded a t-value for Group X on
the research paper and regular essays of .7398 with 48 degrees
of freedom, not a significant difference. The value for Group
Y was 1.1587, not significant.


The tables should be studied with this in mind: the first
semester is entirely devoted to the single short essay. Ten
essays are required during the first semester, and classroom
discussion and outside reading are directed to techniques of
producing acceptable short essays. It can also be noted that
there is an almost exact reversal in the mean and standard
deviation figures for Groups X and Y on the regular essay and
the research papers (see Table 1).


TABLE 1


COMPARISON OF ACHIEVEMENT ON THE RESEARCH PAPER AND REGULAR ESSAYS BETWEEN GROUPS


                   Writing Group X      Writing Group Y
                   N    X̄     s         N    X̄     s        df   t-value
Research Paper     25   5.64  3.26      23   6.39  3.45     46   .774
Regular Essays     25   6.04  2.41      23   5.39  2.29     46   .957


CONCLUSIONS


Since the null hypothesis was not rejected, it was concluded
that the described process of periodic supervision and
extensive correction and comment was not validated as a means
of aiding students to produce a more effective research paper
than their peers, who have not received such supervision and
comment.


The additional between-group test for a signifi- 
cant difference in achievement on the regular essays 
was carried out in an attempt to determine whether 
a difference in the type of assignments would affect 
either of the writing groups. Since this difference 
was not significant, it was concluded that a possible 
major variable was minimized. The comparison also
indicated that neither group was a significantly
superior writing group (see Table 2). 


TABLE 2 


COMPARISON OF ACHIEVEMENT ON THE 
RESEARCH PAPER AND REGULAR ESSAYS 
WITHIN GROUPS 


             Writing Group X          Writing Group Y

             Research    Regular      Research    Regular
             Papers      Essays       Papers      Essays

N            25          25           23          23

X̄            5.64        6.04         6.39        5.39

S            3.26        2.41         3.45        2.29

df                 48                       44

t-value           .740                     1.159

DISCUSSION


A greater number of raters would have been
desirable in this experiment because the nature of the
procedures precluded anonymity for the student-writers
as far as the instructor was concerned. However,
the extremely high reliability coefficient
indicates, for these samples at least, that a variation
in grade achievement because of grading practices
or bias was minimized. Peripherally, this high
correspondence agrees with the findings of prior research
which states that high reliability between raters can
be achieved when raters select criteria together and
are familiar with the application of chosen standards.


TWO-FACTOR EXPLANATION OF POST-HIGH SCHOOL DESTINATIONS IN HAWAII


PAUL W. DIXON
University of Hawaii, Hilo Campus

NOBUKO K. FUKUDA
University of Hawaii, Hilo Campus

ANNE E. BERENS
University of Hawaii


ABSTRACT


Data from 484 students out of a Hawaiian high school
graduating class of 643 were used in a study of post-high
school destinations. Of these 643 students, 360 were
found to have selected academic destinations. They
were going either to universities, a regional 2-year branch
of the state university, 4-year colleges, junior colleges,
or technical schools. A factor analysis of ten variables
consisting of teachers' ratings of school related
behavior, verbal (V) and quantitative (Q) School and
College Abilities Tests (SCAT) scores, and rank in
graduating class (R) revealed two factors: "g," a general
intellective factor, and school skills
(teachers' ratings and R). In general the institutions of
nificantly higher on the variables of both factors. Varia
of ego self-direction and social control as it affected mo


ATTENTION should be given to the personality
and academic characteristics of high school students,
since they may affect students' choice of activity
after high school. Students, counselors, and parents
would find information on these characteristics
valuable, since this would help them to predict
the relationship between the student's performance
in high school and his probable choice of destination
after high school. Since post-high school destination
choice is usually the first major decision influencing
lifetime goals, it is important that the students and
their advisors have this kind of information available.


The setting for this study is of the "natural laboratory"
variety, since the high school is located on one of
the islands of the Hawaiian Island chain which has
limited post-high school educational options. Students
must make a relatively clear-cut decision between
their home environment, with a particular kind
of scholastic opportunity available, and a scholastic
environment which imposes a geographical separation
from their immediate home. The choice of post-high
school destination is limited to attending a technical
school or a 2-year campus of a state university
if they remain on the home island or attending a
university or 4-year college if they leave the island.


Seibel (9) found that about 40 to 49 percent of low
ability (lower quarter of the distribution) high school
seniors attending college choose 2-year institutions,
quarter) group where 90 percent attended 4-year
institutions and only 10 percent attended junior colleges.
Berdie (1) found that for students who plan to
attend college mean American College Entrance
of college (i.e.,


A related study was performed by Richards and
Braskamp (7) which analyzed the differences between
students attending 2-year and 4-year campuses.
Their findings indicated that students who attended
2-year campuses (1) tended to be less able
academically than those attending 4-year colleges on
both the aptitude (APT) test and on high school grade
point averages; (2) varied more academically
than did students in 4-year colleges; and (3) had
fewer nonacademic accomplishments, except in one area,
than did 4-year college students. Thus, "junior
colleges attract pragmatic students seeking vocational
training; they are less attractive to talented
students, or intellectually and academically oriented,
who plan a degree in one of the traditional subject
areas, and who expect to take part in a wide variety
of activities in college." They concluded that
junior-college students typically differ
from students attending 4-year colleges. The student
attending a 2-year campus is more pragmatically
oriented and desires more technical instruction in
comparison to the student attending a 4-year campus who
is looking for more academic, intellectual stimulation.


A factor analytic study by Richards and Holland 
(8) indicated that there were four major areas of 
influence which determine college choice: intellec- 
tual emphasis, practicality, advice of others, and 
social emphasis. They also found that these four 
areas are highly similar for men and women. The 
differences in college choice are related to aspects 
of the institution which the student is interested in 
attending. 


A factor analysis of SCAT V and Q, teachers’ 
ratings, and rank in high school class was perform- 
ed in order to elucidate the simple structure among 
all variables. This study attempts to analyze the re- 
lationships which obtain between school-generated 
measures and post-high school destinations.
Comparisons were made for
dent to show the value of
made between males (M) and females (F) who
attended the 2-year branch of the University of Hawaii,
Hilo Campus on the island of Hawaii (UHHC), or who
went to the main branch of the university on the
island of Oahu (UHM).
between those attending UHM and those
ing college (COLL) and students attending junior
college both in Hawaii and on the mainland (ALL JC).
Another comparison was made between those attending
the 2-year branch of UHHC and those attending
junior college elsewhere (ALL JC). A similar com- 
parison was made between those attending the 2-year 
campus of UHHC and those attending universities on 
the mainland. A similar comparison was made be- 
tween those attending the 2-year UHHC campus and 
those attending colleges on the mainland. An overall 
comparison was made between technical school
students (TECH) and all college and university students
(ALL COLL). Another comparison was made between
males attending technical school and males
attending colleges and universities. A comparison was
also made between females attending technical schools
and females attending colleges and universities.


METHOD 
Subjects 


The students were members of the graduating class 
of a large Hawaiian high school. Those students who 
were selected for study had complete records on 
measures of SCAT, Teachers’ Ratings, and Rank in 
Class (R). From a graduating class of 643, 484
were reached in the study the semester after June
graduation. The scores of these students were used
to obtain the factors by means of an oblique solution
to factor rotation. Of those reached, 360 had
post-high school destinations of an academic nature. The
scores from these students were used in the
comparisons between students attending universities,
colleges, junior colleges, and technical schools. The
instrumentation and procedure follow those described
in an earlier publication (4).


RESULTS 


The subroutines (PERSUB) of Bottenberg and
Ward (2) and an IBM 7040 computer were used to
determine the extent to which post-high school
destination and sex variables contributed to differences
in SCAT, Teachers' Ratings, and R. PERSUB uses
the statistical techniques of multiple linear regression
to determine F ratios and exact probability values
to 4-place accuracy. It presents a
method for programming computation of the estimated
probability for any specific F value. The estimate
is based on the actual distribution of scores
and may be used with df ranging from 4 to 1,000 (6).
The F ratio is computed between a full model
regression equation containing all predictor variables
under consideration and a restricted model.


In this analysis the information about a particular
variable is not included in the restricted model's
equation, and the predictive efficiency of this equation
is compared with the predictive efficiency of
the full model's equation, in which all the information
for each variable is included (for complete
discussion, see 4). For example, in predicting SCAT
V scores the full model would include the female
UHHC and female UHM differences in the population,
while the restricted model omits the information that
different destinations for females existed. If knowledge
of destination differences helped in the prediction
of SCAT V scores, there would be a significant
difference, using the F ratio statistic, between the
full model equation's R², containing destination
differences, and the restricted model's R², which does
not include this information. The hypothesis tested
is that members of the sample have different
destinations but the same expected score on SCAT V.
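

The full-versus-restricted comparison described above reduces to a standard F ratio on the two R² values. The sketch below assumes the usual form F = [(R²_full - R²_restricted) / df1] / [(1 - R²_full) / df2]; the function name and the choice of a restricted R² of zero (a restricted model carrying no destination information at all) are illustrative assumptions, not details given in the text.

```python
def f_ratio(r2_full, r2_restricted, df_num, df_den):
    """F statistic comparing a full regression model with a restricted one."""
    if not 0.0 <= r2_restricted <= r2_full < 1.0:
        raise ValueError("expected 0 <= R2_restricted <= R2_full < 1")
    explained = (r2_full - r2_restricted) / df_num   # gain per added predictor
    unexplained = (1.0 - r2_full) / df_den           # residual per error df
    return explained / unexplained


# Example: R2 = 0.10 with df = 1/93, as printed for one row of Table 5.
# The restricted R2 of 0.0 is an assumption made for illustration.
f_value = f_ratio(0.10, 0.0, 1, 93)
print(f"F(1, 93) = {f_value:.2f}")
```

Table 5 prints 10.36 for this row (SCAT Q, females at UHM versus UHHC); the difference from the value computed here is consistent with R² having been rounded to two places in the table.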


Rotation was by means of Digman's (3) method,
which is essentially a variant of the Harris-Kaiser
system (5). In this type of factor
agonals are
ining diagonals so that the values in the diagonals
are the variance which these diagonals have
in common with the other variables. Therefore, the
values in the diagonals are based on the data rather
than on the assumed value that the diagonals should
have. A 10x10 variance-covariance matrix was
rotated by this method.


The correlations among the ten variables are shown
in Table 1; two factors were found to have an
eigenvalue >1.0. The correlation matrix was rotated by
the Varimax procedure with the results shown in
Table 2. This table presents the factor loadings
according to an oblique solution. No allowance was
made for differences between sexes in this analysis.
This follows the work of Richards and Holland (8),
who found no differences due to sex in their factor
analysis of major influences in college choice from
a 27-item questionnaire describing influences on
college choice.


TABLE 1


INTERCORRELATIONS BETWEEN SCAT V AND Q, TEACHERS' RATINGS, AND R


                 1      2      3      4      5      6      7      8      9      10
 1 = SCAT V    1.00   0.67   0.51   0.48   0.48   0.53   0.50   0.45   0.45  -0.56
 2 = SCAT Q           1.00   0.54   0.51   0.52   0.52   0.55   0.47   0.42  -0.58
 3 = ACCU                    1.00   0.85   0.89   0.88   0.87   0.78   0.74  -0.77
 4 = COOP                           1.00   0.93   0.89   0.93   0.81   0.78  -0.72
 5 = E-I                                   1.00   0.92   0.94   0.82   0.76  -0.76
 6 = I-L                                          1.00   0.91   0.77   0.81  -0.74
 7 = R-R                                                 1.00   0.84   0.72  -0.76
 8 = P-P                                                        1.00   0.64  -0.67
 9 = S-F                                                               1.00  -0.59
10 = R                                                                        1.00


Note: Negative correlations between R (10) and other scores are the result of smaller R value showing higher
rank in class.


Table 3 shows the intercorrelations (factor cosines)
between the factors derived from the
factor analysis. Table 3 shows that a high positive
relationship obtains between the two
factors.
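

The eigenvalue-greater-than-1.0 rule mentioned above can be checked against the printed correlations. The sketch below is an illustration only: it applies a plain cyclic Jacobi iteration (not the authors' procedure, which is not described in detail) to the Table 1 values as transcribed, and the function names are hypothetical.

```python
import math

# Upper triangle of Table 1, row by row, each row starting at the diagonal.
UPPER = [
    [1.00, 0.67, 0.51, 0.48, 0.48, 0.53, 0.50, 0.45, 0.45, -0.56],
    [1.00, 0.54, 0.51, 0.52, 0.52, 0.55, 0.47, 0.42, -0.58],
    [1.00, 0.85, 0.89, 0.88, 0.87, 0.78, 0.74, -0.77],
    [1.00, 0.93, 0.89, 0.93, 0.81, 0.78, -0.72],
    [1.00, 0.92, 0.94, 0.82, 0.76, -0.76],
    [1.00, 0.91, 0.77, 0.81, -0.74],
    [1.00, 0.84, 0.72, -0.76],
    [1.00, 0.64, -0.67],
    [1.00, -0.59],
    [1.00],
]


def symmetric_from_upper(upper):
    """Expand an upper-triangular listing into a full symmetric matrix."""
    n = len(upper)
    full = [[0.0] * n for _ in range(n)]
    for i, row in enumerate(upper):
        for offset, value in enumerate(row):
            full[i][i + offset] = value
            full[i + offset][i] = value
    return full


def jacobi_eigenvalues(matrix, max_sweeps=100, tol=1e-20):
    """Eigenvalues of a symmetric matrix via cyclic Jacobi rotations."""
    n = len(matrix)
    a = [row[:] for row in matrix]
    for _ in range(max_sweeps):
        off = sum(a[i][j] ** 2 for i in range(n) for j in range(n) if i != j)
        if off < tol:
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if a[p][q] == 0.0:
                    continue
                diff = a[q][q] - a[p][p]
                # Rotation angle that zeroes the (p, q) element.
                theta = math.pi / 4 if diff == 0.0 else 0.5 * math.atan(2.0 * a[p][q] / diff)
                c, s = math.cos(theta), math.sin(theta)
                for k in range(n):  # update columns p and q
                    akp, akq = a[k][p], a[k][q]
                    a[k][p] = c * akp - s * akq
                    a[k][q] = s * akp + c * akq
                for k in range(n):  # update rows p and q
                    apk, aqk = a[p][k], a[q][k]
                    a[p][k] = c * apk - s * aqk
                    a[q][k] = s * apk + c * aqk
    return sorted((a[i][i] for i in range(n)), reverse=True)


eigenvalues = jacobi_eigenvalues(symmetric_from_upper(UPPER))
retained = [value for value in eigenvalues if value > 1.0]
print("eigenvalues:", [round(value, 3) for value in eigenvalues])
print("eigenvalues above 1.0:", len(retained))
```

The eigenvalues sum to 10, the number of variables, so the count above 1.0 can be compared with the two factors the authors report. Because the printed correlations are rounded to two places, values close to 1.0 should be read with caution.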


An analysis was performed using the statistical 
technique of multiple linear regression comparing 
students with different destinations after high school 
on the same ten variables. Table 4 shows the list of 
means for the comparisons between the 2-year branch 
of the university and students in other colleges and 
universities. It shows comparisons between students 
in the island university system and compares them 
with college students on the mainland, students at- 
tending MNLD JC, and students attending HAWAII 
JC. A similar comparison is made between HAWAII
COLL and MNLD COLL students. ALL COLL students
are compared with ALL JC students. A comparison
is also made between University of Hawaii
students at the 2-year campus (UHHC) and those
attending universities on the mainland (MNLD UNIV),
and a similar comparison is made between students
attending the 2-year campus of the university (UHHC)
and those attending colleges on the mainland (MNLD
COLL). An overall comparison is made between
those attending technical school (TECH) and those
attending all colleges and universities (COLL). This
analysis compared males in colleges and universities
(M ALL COLL) and males attending technical
schools (MALE TECH). Similarly, female technical
school students (FEMALE TECH) and females
attending colleges and universities are compared (F
ALL COLL). Table 5 presents all the comparisons
listed here giving R-square, df, and F ratios.


DISCUSSION 


TABLE 2


OBLIQUE LOADINGS OF TEN VARIABLES ON TWO FACTORS


                School Skills      "g"
                      1             2
 1 = SCAT V        -0.12          0.62 x
 2 = SCAT Q        -0.06          0.57 x
 3 = ACCU           0.48 x        0.07
 4 = COOP           0.57 x       -0.03
 5 = E-I            0.56 x       -0.02
 6 = I-L            0.52 x        0.03
 7 = R-R            0.53 x        0.01
 8 = P-P            0.53 x        0.01
 9 = S-F            0.53 x       -0.01
10 = R             -0.31 x       -0.26


x meets criterion ± .30


Table 2 shows the two factors that were obtained
from an oblique analysis of the correlation matrix
found in Table 1. Using the criterion that factor
loadings must be greater than ± .30 to be considered
meaningful, it was found that on Factor 1 teachers'
ratings and R load positively. (Since the scale for
R is reversed, a negative value shows positive
prediction.) Factor 2 appears to be a "g" or general
intellective factor, since only SCAT V and Q meet
the criterion of loading weight on that factor. It is
easily described, since it shows intellective
operations. Factor 1 may be described as a school skills
variable, since teachers' ratings and R weigh
positively on Factor 1.
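

The ± .30 criterion can be applied mechanically to the loadings printed in Table 2. The following is a minimal sketch; the function name `salient_variables` is hypothetical, and "greater than ± .30" is read here as an absolute loading above .30.

```python
THRESHOLD = 0.30

# (variable, loading on Factor 1, loading on Factor 2), as printed in Table 2.
LOADINGS = [
    ("SCAT V", -0.12, 0.62),
    ("SCAT Q", -0.06, 0.57),
    ("ACCU", 0.48, 0.07),
    ("COOP", 0.57, -0.03),
    ("E-I", 0.56, -0.02),
    ("I-L", 0.52, 0.03),
    ("R-R", 0.53, 0.01),
    ("P-P", 0.53, 0.01),
    ("S-F", 0.53, -0.01),
    ("R", -0.31, -0.26),
]


def salient_variables(loadings, factor_index, threshold=THRESHOLD):
    """Names of variables whose absolute loading on a factor exceeds the threshold."""
    return [row[0] for row in loadings if abs(row[factor_index]) > threshold]


factor_1 = salient_variables(LOADINGS, 1)  # school skills
factor_2 = salient_variables(LOADINGS, 2)  # "g"
print("Factor 1:", factor_1)
print("Factor 2:", factor_2)
```

With the printed values, only SCAT V and SCAT Q pass the criterion on Factor 2, while the seven teachers' ratings and R pass it on Factor 1, which matches the description in the text.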


TABLE 3


FACTOR CORRELATIONS


                School Skills      "g"
                      1             2
1                   1.00          +0.84
2                                  1.00


Dixon and others (4) have shown that some teachers'
ratings which are related to Factor 1 predict
Not all
dustry (E-I), and Reliability and Responsibility (R-
ion to pre-
rating variables are more salient in prediction of high school


From the analysis of the differences in SCAT scores,
teachers' ratings, and rank in class, we find a
pattern of differences which show a rather uniform set
of mean differences among individuals going to
various post-high school destinations. Thus, for females
attending the major university in the state as
compared with females attending the 2-year branch of the
university, there are significant differences in the
means for all variables showing that the former group
has significantly higher scores on SCAT V and Q,
teachers' ratings, and R. For males a somewhat
different pattern can be seen, since males going to
the main university campus are significantly higher
on SCAT scores and rank in class but do not differ
significantly from those attending the local branch
of the university on teachers' ratings. Females in
the first comparison would thus differ on both
factors. All factors would contribute to differences in
personality and intellectual profiles of females
attending UHHC as compared with those
attending the main university campus. Males
differ only on Factor 2, the "g" factor, and not on
the other factor. The ability level of a male student
would be predictive of transfer to the main campus of
the university, while high school performance would
not be important. The significant difference in rank
in class in favor of those attending the main
university campus would therefore be due to differences in
intellectual ability rather than in school-related
personality characteristics.


Another comparison was made between students
attending the main campus of the university and those
attending college on the mainland. In this case the
major determinant was a significant difference in rank
in class in favor of those going to mainland
universities. Intellectual ability and school-related
personality characteristics would therefore not be important
as far as post-high school destination is concerned in
this instance. Actual performance in high school is,
therefore, the best measure for determining post-high
school destination for students who are leaving
the immediate vicinity to go to colleges on the
mainland.


A comparison was made between students attending
junior colleges on the mainland and junior colleges
in Hawaii, excluding the 2-year branch (UHHC).
Here no significant differences were found. Another
comparison was made between students attending
HAWAII COLL and those attending MNLD COLL.
Again no significant differences were obtained in this
comparison. A further comparison was made between
those attending COLL and those attending ALL JC
excluding the 2-year branch (UHHC). In this
comparison no significant differences were obtained.
Thus, these students are not distinguishable to a
significant degree on the measures which were obtained
even though the means for the students attending
4-year colleges are in all cases above the means of the
students attending junior colleges. The lack of
significant difference here may be due to the small
number of subjects available for these comparisons.


Students attendi
(UHHC) were compared with students attending junior
ainland. The result
2-year branch of UHHC
y higher mean scores on the variables
The significant differences


Similarly, a comparison between university students
attending the 2-year branch (UHHC) and those
attending universities on the mainland revealed a
uniform superiority on all variables in favor of students
who attend mainland universities except for SCAT Q.
Selection procedures at mainland universities seem
to favor the more highly verbal student. The students
at the 2-year UHHC did significantly better on SCAT
Q. Quantitative abilities are thus related to lack of
mobility toward mainland university attendance as
compared with all other variables. This is the case even
though SCAT V and Q are positively correlated
(r = +.67). Quantitative abilities may
therefore relate to a constellation of personality
attributes which are negatively related to mobility.
An earlier study (4) showed that teachers' ratings
relate more closely to SCAT Q than to SCAT V.


TABLE 5


COMPARISON OF ALL GROUPS SHOWING R², df, F, AND P VALUES FOR EACH COMPARISON


Variable   Comparison          R²      df       F

R          F UHM UHHC          0.20    1/93
SCAT V     F UHM UHHC          0.17    1/93
SCAT Q     F UHM UHHC          0.10    1/93     10.36**
ACCU       F UHM UHHC          0.16    1/93     18.39**
COOP       F UHM UHHC          0.12    1/93     13.38**
E-I        F UHM UHHC          0.16    1/93     19.37**
I-L        F UHM UHHC          0.14    1/93     15.89**
R-R        F UHM UHHC          0.13    1/93     13.97**
P-P        F UHM UHHC          0.14    1/93     15.09**
SELF       F UHM UHHC          0.08    1/93      7.13**

R          M UHM UHHC          0.04    1/81      3.82*
SCAT V     M UHM UHHC          0.09    1/81      7.81**
SCAT Q     M UHM UHHC          0.11    1/81     10.49**
ACCU       M UHM UHHC          0.02    1/81      1.52
COOP       M UHM UHHC          0.03    1/81      2.26
E-I        M UHM UHHC          0.02    1/81      1.82
I-L        M UHM UHHC          0.02    1/81      1.82
R-R        M UHM UHHC          0.01    1/81      0.97
P-P        M UHM UHHC          0.02    1/81      1.32
SELF       M UHM UHHC          0.02    1/81      1.95

R          UHM MNLD            0.09    1/41      3.95*
SCAT V     UHM MNLD            0.08    1/41      3.95
SCAT Q     UHM MNLD            0.05    1/41      2.17
ACCU       UHM MNLD            0.06    1/41      2.64
COOP       UHM MNLD            0.06    1/41      2.61
E-I        UHM MNLD            0.06    1/41
I-L        UHM MNLD            0.08    1/41      3.42
R-R        UHM MNLD            0.04    1/41      1.59
P-P        UHM MNLD            0.05    1/41      2.18
SELF       UHM MNLD            0.07    1/41      2.99

R          HAW MNL JC          0.00    1/11      0.01
SCAT V     HAW MNL JC          0.01    1/11      0.09
SCAT Q     HAW MNL JC          0.01    1/11      0.07
ACCU       HAW MNL JC          0.00    1/11      0.04
COOP       HAW MNL JC          0.00    1/11      0.04
E-I        HAW MNL JC          0.00    1/11      0.02
I-L        HAW MNL JC          0.01    1/11      0.11
R-R        HAW MNL JC          0.00    1/11      0.00
P-P        HAW MNL JC          0.00    1/11      0.00
SELF       HAW MNL JC          0.01    1/11      0.08

R          COL HAW MNL         0.01    1/19      0.20
SCAT V     COL HAW MNL         0.01    1/19      0.25
SCAT Q     COL HAW MNL         0.01    1/19      0.15
ACCU       COL HAW MNL         0.01    1/19      0.16
COOP       COL HAW MNL         0.01    1/19      0.10
E-I        COL HAW MNL         0.01    1/19      0.14
I-L        COL HAW MNL         0.01    1/19      0.18
R-R        COL HAW MNL         0.01    1/19      0.14
P-P        COL HAW MNL         0.01    1/19      0.12
SELF       COL HAW MNL         0.01    1/19      0.14

R          COLL ALL JC         0.01    1/32      0.35
SCAT V     COLL ALL JC         0.01    1/32      0.48
SCAT Q     COLL ALL JC         0.01    1/32      0.27
ACCU       COLL ALL JC         0.01    1/32      0.36
COOP       COLL ALL JC         0.01    1/32      0.26
E-I        COLL ALL JC         0.01    1/32      0.29
I-L        COLL ALL JC         0.02    1/32      0.51
R-R        COLL ALL JC         0.02    1/32      0.54
P-P        COLL ALL JC         0.01    1/32      0.19
SELF       COLL ALL JC         0.01    1/32      0.36

R          ALL JC UHHC         0.20    1/164
SCAT V     ALL JC UHHC         0.25    1/164    55.91**
SCAT Q     ALL JC UHHC         0.21    1/164    44.16**
ACCU       ALL JC UHHC         0.13    1/164
COOP       ALL JC UHHC         0.12    1/164
E-I        ALL JC UHHC         0.12    1/164
I-L        ALL JC UHHC         0.12    1/164
R-R        ALL JC UHHC         0.22    1/164    38.86**
P-P        ALL JC UHHC         0.09    1/164    16.56
SELF       ALL JC UHHC         0.08    1/164    14.20**

R          UHHC MNLD UNIV      0.27    1/169    61.46**
SCAT V     UHHC MNLD UNIV      0.32    1/169    81.41**
SCAT Q     UHHC MNLD UNIV      0.25    1/169    56.40**
ACCU       UHHC MNLD UNIV      0.17    1/169    34.26**
COOP       UHHC MNLD UNIV      0.17    1/169    33.90**
E-I        UHHC MNLD UNIV      0.16    1/169    32.67**
I-L        UHHC MNLD UNIV      0.18    1/169    37.51**
R-R        UHHC MNLD UNIV      0.24    1/169    45.89**
P-P        UHHC MNLD UNIV      0.11    1/169    21.75
SELF       UHHC MNLD UNIV      0.13    1/169    25.95**

R          UHHC MNL COL        0.22    1/166    47.39
SCAT V     UHHC MNL COL        0.28    1/166    65.42**
SCAT Q     UHHC MNL COL        0.23    1/166    50.33**
ACCU       UHHC MNL COL        0.14    1/166    28.30**
COOP       UHHC MNL COL        0.13    1/166    24.90**
E-I        UHHC MNL COL        0.14    1/166    26.16**
I-L        UHHC MNL COL        0.14    1/166    26.94**
R-R        UHHC MNL COL        0.07    1/166    12.76**
P-P        UHHC MNL COL        0.10    1/166    18.27**
SELF       UHHC MNL COL        0.09    1/166    16.54**

R          TECH ALL COLL       0.41    1/383   269.19**
SCAT V     TECH ALL COLL       0.40    1/383   255.36**
SCAT Q     TECH ALL COLL       0.34    1/383   194.35**
ACCU       TECH ALL COLL       0.28    1/383   152.66**
COOP       TECH ALL COLL       0.26    1/383   136.72**
E-I        TECH ALL COLL       0.28    1/383   147.93**
I-L        TECH ALL COLL       0.27    1/383   143.22**
R-R        TECH ALL COLL       0.32    1/383   160.83**
P-P        TECH ALL COLL       0.24    1/383   122.30**
SELF       TECH ALL COLL       0.18    1/383    81.67**

R          M TECH M ALL        0.23    1/186    54.34
SCAT V     M TECH M ALL        0.21    1/186    48.91**
SCAT Q     M TECH M ALL        0.18    1/186    41.14**
ACCU       M TECH M ALL        0.14    1/186    31.21**
COOP       M TECH M ALL        0.13    1/186    27.68**
E-I        M TECH M ALL        0.14    1/186    30.70**
I-L        M TECH M ALL        0.13    1/186    28.46**
R-R        M TECH M ALL        0.41    1/186    87.11**
P-P        M TECH M ALL        0.13    1/186    28.21**
SELF       M TECH M ALL        0.10    1/186    21.09**

R          F TECH F ALL        0.24    1/195    60.94**
SCAT V     F TECH F ALL        0.24    1/195    62.96**
SCAT Q     F TECH F ALL        0.16    1/195    36.08**
ACCU       F TECH F ALL        0.18    1/195    44.12**
COOP       F TECH F ALL        0.16    1/195    36.56**
E-I        F TECH F ALL        0.19    1/195    47.05**
I-L        F TECH F ALL        0.18    1/195    43.10**
R-R        F TECH F ALL        0.26    1/195    60.18**
P-P        F TECH F ALL        0.18    1/195    42.25**
SELF       F TECH F ALL        0.10    1/195    22.69**


Note: Full model regression equation: general form based on binary vectors denoting sex group membership
(Male, Female) predicting to criteria of interest.
** p < .01 level of significance.
 * p < .05 level of significance.
F = Female.
M = Male.


A further comparison was made between the
2-year branch (UHHC) and mainland college
attendance. In this instance, the students attending the
2-year campus (UHHC) scored significantly higher
on SCAT scores and on teachers' ratings of
cooperation (COOP), E-I, initiative and leadership (I-L),
R-R, and Punctuality (P-P) as compared with
students attending mainland universities. Students
attending mainland colleges, on the other hand,
scored significantly higher on ratings of ACCU, I-L, and
self-confidence (S-F) and were also significantly
higher in rank in class. Students at the 2-year UHHC
were therefore higher on Factor 2 ("g") than those
attending mainland colleges, while those attending
mainland colleges were higher on some variables of
Factor 1 (school skills) except for COOP, E-I,
R-R, and P-P, on which the 2-year UHHC students
were significantly higher. The students attending
the 2-year UHHC campus would thus be characterized
as being significantly higher in intellectual
abilities and on teachers' rating scale variables which
measure socialization int
comparison, stays
near home at the 2-year UHHC campus. This par-
elationship between
of post-high school
d that the teachers'
ring the same personality characteristics.


The comparisons between technical school students
and those attending colleges and universities revealed
a uniform superiority on all variables in favor of
those attending the more academic institution.
Technical school students are therefore different from
those attending college in school-related skills as
well as on general ability factors.


SUMMARY AND CONCLUSIONS 


Two factors were obtained from the oblique
solution to factor rotation using SCAT V and Q, teachers'
ratings and R. These factors were (1) a combination
of teachers' rating scales and R, which was
labeled a "school skills" factor and (2) "g," or a
general intellective factor with SCAT V and Q
meeting the weighting criterion.


In general, higher scores on variables weighting
on both factors differentiated between students in the
following order: universities > colleges > 2-year
branch of the state university > junior colleges >
technical schools. There were, however, some
exceptions to this picture. Females attending the main
university campus in the state differed on both
factors, while males under the same comparisons only
differed on "g" and R. For those attending colleges
on the mainland as compared with those attending the
main university campus, a significantly higher mean
on R was found for those going to the mainland. A
comparison between the 2-year campus (UHHC) and
junior colleges showed overall superiority for the
2-year campus of the university on all variables, due
probably to differences in selection procedures. A
comparison between mainland university students and
those attending the 2-year UHHC campus showed the
university students significantly higher on all
variables except SCAT Q. This suggests that verbal,
rather than quantitative, abilities have a greater effect
on mainland university attendance.


The comparison between students attending
mainland colleges and the 2-year UHHC campus revealed
the most interesting relationship between the factor
analysis and post-high school destination.
Significantly higher scores on variables weighting on
Factor 1, showing ego self-direction in higher mean
scores on ACCU, S-F, and I-L and higher R,
characterized students attending mainland colleges.
Students attending the 2-year campus (UHHC) were
significantly higher on Factor 2, "g," and teachers'
rating of COOP, R-R, P-P, and E-I from Factor
1, and lower on R. Students attending mainland
colleges are judged by teachers as being more
self-confident and ego self-directed; furthermore, they are
found to be more effective in high school, having a
significantly higher mean R, while those attending the
2-year UHHC campus have high "g" but are more
submissive, conscientious, and socially controlled.
They are, thus, more likely to remain under
parental control. High "g" does not necessarily make a
student more ego self-directed and more venturesome;
instead, it may relate to a greater degree of
socialization.


The high positive correlation shown in Table 3
between Factor 1 (school skills) and Factor 2 ("g,"
overall intellectual ability) indicates that within this
particular high school setting, ability, as measured
by the SCAT, is predictive of greater competence in
meeting the requirements of the school system.


FOOTNOTE 


1. This research was supported by National Science
Foundation funds, administered by the Research
Council of the University of Hawaii. The
authors are indebted to John M. Digman and Elsie
H. Ahern for valuable assistance with this paper.


BEHAVIORAL PROBLEM CHILDREN IN THE SCHOOLS


Woody, Robert H., (New York: Appleton-Century-Crofts, 1969), 264 pp.


BEHAVIORAL Problem Children in the Schools attempts to give classroom teachers, school counselors,
and school psychologists an understanding of behavior-problem children and an approach to meet the needs of
these children. This approach, which combines learning theory and conditioning techniques with insight-oriented
techniques of counseling and psychotherapy, the author calls psycho-behavioral therapy. Although these theoretical
positions differ, Woody feels their common elements used conjointly constitute the most effective means of
meeting the needs of most behavioral-problem children.


ned as "the child who cannot or will not adjust to the socially acceptable
s his own academic progress, the learning efforts of his classmates, and
every child has, or could conceivably have, behavior problems at some
dered a behavioral-problem child." The reviewer would disagree with
rticularly in view of Woody's own statements in chap
ot be in a different situation; and
the personal opinions, theoretical and professional orienta
ence the definition. Few
would argue, however, with his premise that it is urgent for educators to learn how to meet the needs of these children.


In part one, Woody outlines the causes and characteristics of behavior problems and covers detection,
referral, and psycho-educational diagnosis. To carry out an adequate psycho-educational diagnosis, Woody feels
that tests must be administered, scored, and results entered; the diagnostician must give a clinical opinion as
to the causes of the problem; and it must be determined what can be expected of the child and what can be done
to help him. "Psycho-educational diagnosis should involve more than one diagnostic technique" and those that
Woody discusses as making important contributions are the social case history, the psychological survey, the
psychiatric, the neurological, and the pediatric examination.


In part two, the author discusses various approaches to behavioral change such as guidance, counseling,
therapy, and behavior therapy. He does not claim that the position he advocates, namely psycho-behavioral
therapy, is a new theory, but rather an outline of practical techniques that have been gleaned from the various
approaches that can be used to change behaviors. He states that not enough is known about individual differences
and the validity of all aspects of the different approaches to claim that one specific approach has universal
applicability. Advocates of the diverse theoretical positions would find it difficult to disagree with this, but would
undoubtedly
tement that "What is perhaps more significant than the practical aspects
ining insight and
the fact that behavior therapy and the counseling-psychotherapy


(Continued on Page 62.)


THE EFFECT OF IMAGE SIZE ON VISUAL LEARNING


FRANCIS M. DWYER 
The Pennsylvania State University 


ABSTRACT 


The purpose of this study was to investigate the effectiveness of four types of visual illustrations used to complement oral instruction and to compare their relative effectiveness when projected on viewing areas of different sizes. Each of the 588 Ss received a pretest, participated in his respective presentation, and received four individual criterial measures. Results indicated that (a) the use of illustrations to complement oral instruction does not automatically improve achievement, and (b) merely increasing the size of visual images used to complement oral instruction will not necessarily improve achievement.


al objectives via the medium of television using 22-inch monitors. The results demonstrated that the use of visual illustrations to complement oral instruction presented to college students does not necessarily improve their achievement on tests measuring different objectives. Only on one criterial measure was the use of visuals found to be an important instructional variable in increasing student achievement.


The purpose of the present evaluation was to replicate and extend the cited study. Specifically, it sought to determine whether: (a) the same results that occurred when students received their visualized presentation via 22-inch monitors would occur when they received the same instruction by means of 5-by-3 foot front projection images and by 6-by-4 foot rear projection, and (b) the same visuals presented in different sizes (i.e., 1½-by-1 foot, 5-by-3 foot, 6-by-4 foot) would be equally effective in facilitating student achievement of different educational objectives.


PROCEDURE


The content material for this evaluation was a 2,000-word instructional unit discussing the human heart, its parts, and internal operations. The taped instruction also contained audio signals which cued the change of slides so that the appropriate visuals appeared simultaneously with the oral instruction they were designed to complement.


Each experimental treatment was complemented by thirty-nine black and white slides. Each slide was specially designed to illustrate a specific item of information presented in the oral instruction. Slides used in each sequence displayed relatively the same information, differing only in the amount of detail they contained. Each instructional presentation was video taped on an Ampex 660B video tape recorder.


Students in each of the three studies received the same instructional treatments; however, the method of presentation differed for each. Students in Study I, the control study, received their instruction in conventional television classrooms via 22-inch monitors; students in Study II received instruction by means of a Telebeam Model A-912-A television projector which provided a 5-by-3 foot projected image; and students in Study III received instruction by means of a Telebeam Model A-912-A television projector which projected a 6-by-4 foot rear screen image.


TREATMENT GROUPS 


Speech 200 classes at The Pennsylvania State University supplied the 588 Ss for this evaluation. Students in each of the studies were assigned to treatment groups according to which of the five instructional sessions they would be able to attend. The experimental treatments were assigned to the five groups at random (see Table 1).


TABLE 1


NUMBERS OF STUDENTS IN EACH TREATMENT FOR EACH STUDY


Treatment                                    Study I    Study II    Study III

Oral Presentation (Group I)                     62         35          32
Drawing Presentation (Group II)                 54         35          29
Detailed Shaded Presentation (Group III)        54         30          36
Heart Model Presentation (Group IV)             51         31          30
Photographic Presentation (Group V)             48         31          30


In each study students in Group I, the control group, received no illustrations of the heart, but they viewed slides containing the names of the parts and processes of the heart as they were mentioned orally. Group II viewed simple line illustrations of the heart. Group III viewed detailed, shaded drawings of the heart. Group IV viewed photographs of a heart model, and Group V viewed realistic heart photographs. All groups received the same oral instruction and viewed their respective instructional presentation for equal amounts of time.


CRITERIAL MEASURES 


In each study each student in each treatment group received the Otis Quick Scoring Mental Ability Test as a pretest, participated in his respective instructional presentation, and then received four individual criterial tests. Scores received on these tests were combined into a 78-item total criterial test. The objective of each test was as follows: (a) the drawing test evaluated learning of specific locations of the patterns and positions of the parts of the heart; (b) the identification test measured transfer of learning, i.e., the ability to identify numbered parts on a drawing of the heart from information presented in the instruction; (c) the terminology test evaluated knowledge of referents for specific symbols; (d) the comprehension test measured understanding of the heart, its parts, and its internal operations; and (e) the total criterial test measured the student's total understanding of the concepts presented (see Table 2).
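
The reliability coefficients reported in Table 2 are Kuder-Richardson Formula 20 values. As a minimal sketch of how such a coefficient may be computed, the function below assumes dichotomously scored items (1 = correct, 0 = incorrect) and uses the population variance of total scores. The function name `kr20` and the small data set are hypothetical and are not taken from the study.

```python
from typing import List


def kr20(responses: List[List[int]]) -> float:
    """Kuder-Richardson Formula 20 for dichotomous (0/1) item scores.

    responses[i][j] is 1 if examinee i answered item j correctly, else 0.
    """
    n_people = len(responses)
    n_items = len(responses[0])

    # Sum of item variances: p (proportion passing) times q = 1 - p.
    pq_sum = 0.0
    for j in range(n_items):
        p = sum(row[j] for row in responses) / n_people
        pq_sum += p * (1.0 - p)

    # Population variance of total scores, consistent with the p*q terms.
    totals = [sum(row) for row in responses]
    mean_total = sum(totals) / n_people
    var_total = sum((t - mean_total) ** 2 for t in totals) / n_people

    return (n_items / (n_items - 1)) * (1.0 - pq_sum / var_total)


# Hypothetical data: 5 examinees by 4 items (not from the study).
demo = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]
print(round(kr20(demo), 2))  # 0.8
```

For the hypothetical data the item variances sum to .80 and the total-score variance is 2.0, so the coefficient is (4/3)(1 - .80/2.0) = .80.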
RESULTS 


TABLE 2


KUDER-RICHARDSON FORMULA 20 RELIABILITY COEFFICIENTS FOR THE FIVE CRITERIAL MEASURES


                          Study I        Study II             Study III
Criterial Tests           (22-inch       (5-by-3 foot         (6-by-4 foot
                          monitor)       front projection)    rear projection)

Drawing Test                .82             .84                  .81
Identification Test         .79             .81                  .76
Terminology Test            .85             .81                  .83
Comprehension Test          .74             .76                  .79
Total Criterial Test        .91             .93                  .92


In each study the Hartley Test for Homogeneity of Variance (3:94-95) was used on scores achieved on the Otis Quick Scoring Mental Ability Test for the five treatment groups. In no case did the observed value of the F max statistic reach the critical value for a .05 level test. Thus, it appeared that the treatment groups in each study were drawn randomly from populations with common variance. The F-ratios resulting from the analysis of variance on scores achieved on the criterial tests indicated that significant differences existed among the means of the five treatment groups on the drawing test in each of the three studies (Study I: F = 7.38, df = 4/264, p < .01; Study II: F = 3.32, df = 4/157, p < .05; Study III: F = 3.86, df = 4/152, p < .01). In the three studies no significant differences existed among the means of the remaining four criterial tests.
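
The homogeneity check described above can be sketched as follows. The F max statistic is the ratio of the largest to the smallest group variance, and it is compared with a tabled critical value. The function names, the sample scores, and the critical value passed in below are all hypothetical placeholders; the appropriate critical value must be looked up in a published F max table for the number of groups and degrees of freedom at hand.

```python
from statistics import variance
from typing import Sequence


def hartley_f_max(groups: Sequence[Sequence[float]]) -> float:
    """Ratio of the largest to the smallest sample variance."""
    variances = [variance(g) for g in groups]
    return max(variances) / min(variances)


def variances_look_homogeneous(
    groups: Sequence[Sequence[float]], critical_value: float
) -> bool:
    """True when the observed F max does not reach the critical value."""
    return hartley_f_max(groups) < critical_value


# Hypothetical pretest scores for three groups (not from the study).
groups = [
    [1.0, 2.0, 3.0, 4.0, 5.0],   # sample variance 2.5
    [2.0, 4.0, 6.0, 8.0, 10.0],  # sample variance 10.0
    [1.0, 3.0, 5.0, 7.0, 9.0],   # sample variance 10.0
]
print(hartley_f_max(groups))  # 4.0

# Placeholder critical value; consult an F max table for the real one.
print(variances_look_homogeneous(groups, critical_value=15.5))  # True
```

The test in this simple form assumes groups of equal size; with unequal groups it is only approximate.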


For each study comparisons among the individual means of the five treatment groups on the drawing test were conducted via Tukey's W Procedure (2:344-345). In each of the three studies the oral presentation without visuals was as effective as the visually complemented treatments on four of the five criterial tests. The exception was the drawing test, for which in all three studies the abstract line presentation (Group II) was significantly more effective than the oral presentation without visuals (Group I) in facilitating student achievement (Study I: Group II > Group I, W = 3.04, n/v = 5/269, p < .01; Study II: Group II > Group I, W = 2.80, n/v = 5/162, p < .05; Study III: Group II > Group I, W = 3.13, n/v = 5/157, p < .05).


TABLE 3


ANALYSIS OF THE EFFECTIVENESS OF THREE METHODS OF INSTRUCTIONAL PRESENTATION


                                        Instructional Treatments

                   Oral            Abstract        Drawing         Heart Model     Photographic
Criterial Tests    Presentation    Line            Presentation    Presentation    Presentation
                   (Group I)       (Group II)      (Group III)     (Group IV)      (Group V)

Pretest            n.s.            n.s.            n.s.            n.s.            n.s.
Drawing            .01             .05             n.s.            n.s.            n.s.
Identification     .05             .05             .05             .05             n.s.
Terminology        n.s.            n.s.            n.s.            n.s.            n.s.
Comprehension      n.s.            n.s.            n.s.            n.s.            n.s.
Total Criterial    n.s.            .05             n.s.            n.s.            n.s.


As was previously stated the second objective of this evaluation was to measure the effectiveness of oral instruction complemented by identical visuals of different sizes. Study I used 1½-by-1 foot images presented by conventional 22-inch monitors; Study II used 5-by-3 foot front projection images; and Study III used 6-by-4 foot rear projection images. Analysis of variance was conducted on scores achieved on the Otis Quick Scoring Mental Ability Test across the three studies, comparing the groups receiving the same instructional treatment in each of the three studies. Results indicated that students in the equivalent group in each of the studies could be considered to have been drawn randomly from populations with common


ist (see Table 3). Tukey’s W- 
to measure the diff. 


An analysis of variar 
who received the oral р 


the drawing (F= 5, 83, 
ntification tests (F 23.74 


Study III, Wz2.67, n/v-3 ; * 
cation test (Study I > Study Ш, w - 2.22 
p~.05), instruction complemented by 15 


ive than instruction comple- 
mented by 6-by-4 foot images (see Table 4). 


For students receiving the abstract line presentation the analysis indicated that significant differences in mean achievement existed among the three methods of presentation on the drawing (F = 3.47, df = 2/115, p < .05), identification (F = 4.01, df = 2/115, p < .05), and total criterial test (F = 3.49, df = 2/115, p < .05). An analysis of the differences between pairs of means on the drawing and identification tests indicated that instruction complemented by 1½-by-1 foot images was more effective than instruction complemented via 6-by-4 foot rear screen images (Study I > Study III, W = 2.38, n/v = 3/118, p < .05) and 5-by-3 foot front projection images respectively (Study I > Study II, W = 2.30, n/v = 3/118, p < .05). Differences between the method means on the total criterial test approached but did not reach the critical value necessary for significance at the .05 level (see Table 5).


For the detailed, shaded drawing presentation, significant differences were found to exist among the three methods of presentation on the identification test (F = 3.62, df = 2/117, p < .05). Analysis again indicated that instruction complemented by 1½-by-1 foot images presented via 22-inch television monitors was more effective than instruction complemented by 6-by-4 foot rear screen images (Study I > Study III, W = 2.20, n/v = 3/120, p < .05) (see Table 6).


For the heart model presentation significant differences were found to exist among the three methods of presentation on the identification test (F = 3.16, df = 2/109, p < .05). Instruction presented to students via the 22-inch monitors was found to be more effective than oral instruction complemented by 5-by-3 foot front projection images (see Table 7).
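
The pairwise comparisons reported above can be illustrated with a short sketch. In Tukey's procedure each difference between two means is compared with a single critical difference, commonly written W = q * sqrt(MS error / n), where q is a tabled studentized range value. The helper names below are hypothetical, and q and the error mean square must be supplied by the reader since the article does not report them. The means used are the Group I drawing test means listed in Table 4; the entries printed there under "W-Values" appear to equal these pairwise differences.

```python
from itertools import combinations
from math import sqrt
from typing import Dict, Tuple


def tukey_w(q_crit: float, ms_error: float, n_per_group: float) -> float:
    """Critical difference W = q * sqrt(MS_error / n) for equal group sizes."""
    return q_crit * sqrt(ms_error / n_per_group)


def pairwise_differences(means: Dict[str, float]) -> Dict[Tuple[str, str], float]:
    """Absolute difference between every pair of means, rounded to 2 places."""
    return {
        (a, b): round(abs(means[a] - means[b]), 2)
        for a, b in combinations(means, 2)
    }


# Group I drawing test means as listed in Table 4.
drawing_means = {"Study I": 10.89, "Study II": 9.37, "Study III": 8.22}
diffs = pairwise_differences(drawing_means)
for pair, diff in diffs.items():
    print(pair, diff)

# Hypothetical inputs only, to show the arithmetic of the critical difference.
print(tukey_w(q_crit=3.0, ms_error=16.0, n_per_group=4))  # 6.0
```

A difference is declared significant when it equals or exceeds the critical value; with unequal group sizes, as in these studies, an adjustment to n would be needed, which this sketch does not attempt.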


DISCUSSION


In each of the three cited studies the oral presentation without visuals was as effective as the visually complemented treatments on four of the five criterial tests. The exception was the drawing test, for which in all three studies the abstract line presentation was significantly more effective than the oral presentation without visuals in facilitating student achievement. These results tend to add confidence to the explanations presented in a previous study (1:40-41):


ents are generally se- 
two-thirds of the popu- 
bal and conceptual ability, 
in a highly favorable 
ing able to learn from 
If this assumption is accu- 
of visual illustrations is 
omplement oral instruction 
te learning Objectives sim- 
ured by the identification, 
prehension, and total cri- 


(b) The realistic detail cont 


i ained within the 
visual illustrat 


ions used to complement the 
may have had the net effect 
е attention of the students 
ial learning cues, thereby 


interfering with rather than facilitating student achievement.


(с) Since Students in each treatment group 


TABLE 4

TUKEY'S W-PROCEDURE FOR DIFFERENCES BETWEEN STUDY MEANS OF GROUP I: THE ORAL PRESENTATION

A. Drawing Test
                                                                              W-Values
                                                     Mean IQ              Study II   Study III
                                            N   SD   Score      Mean        9.37       8.22

Study I: 22-inch Monitors                  62   2.9  122.07    10.89        1.52       2.67**
Study II: 5-by-3 foot Front Projection     35   3.8  122.14     9.37                   1.15
Study III: 6-by-4 foot Rear Projection     32   4.9  121.59     8.22

B. Identification Test
                                                                              W-Values
                                                                          Study II   Study III
                                                SD              Mean       12.86      11.17

Study I: 22-inch Monitors                       3.4            13.39         .53       2.22*
Study II: 5-by-3 foot Front Projection          3.7            12.86                   1.69
Study III: 6-by-4 foot Rear Projection          4.5            11.17

*  p < .05
** p < .01
TABLE 5

TUKEY'S W-PROCEDURE FOR DIFFERENCES BETWEEN STUDY MEANS OF GROUP II: ABSTRACT LINE PRESENTATION

A. Drawing Test
                                                                              W-Values
                                                     Mean IQ              Study II   Study III
                                            N   SD   Score      Mean       12.17      11.35

Study I: 22-inch Monitors                  54   3.9  122.15    13.12        1.98       2.37*
Study II: 5-by-3 foot Front Projection     35   4.0  121.29    12.17                    .82
Study III: 6-by-4 foot Rear Projection     29   4.1  122.35    11.35

B. Identification Test
                                                                              W-Values
                                                                          Study II   Study III
                                                SD              Mean       12.26      13.10

Study I: 22-inch Monitors                  (illegible)         14.56        2.30*      1.46
Study II: 5-by-3 foot Front Projection     (illegible)         12.26                (illegible)
Study III: 6-by-4 foot Rear Projection          3.1            13.10

* p < .05


TABLE 6


TUKEY'S W-PROCEDURE FOR DIFFERENCES BETWEEN STUDY MEANS OF GROUP III: DETAILED, SHADED DRAWING PRESENTATION

A. Identification Test
                                                                              W-Values
                                                     Mean IQ              Study II   Study III
                                            N   SD   Score      Mean       13.47      11.86

Study I: 22-inch Monitors                  54   4.0  120.19    14.06         .59       2.20*
Study II: 5-by-3 foot Front Projection     30   3.7  120.43    13.47                   1.61
Study III: 6-by-4 foot Rear Projection     36   3.7  122.83    11.86

* p < .05


TABLE 7

TUKEY'S W-PROCEDURE FOR DIFFERENCES BETWEEN STUDY MEANS OF GROUP IV: HEART MODEL PRESENTATION

A. Identification Test
                                                                              W-Values
                                                     Mean IQ              Study II   Study III
                                            N   SD   Score      Mean       12.36      12.86

Study I: 22-inch Monitors                  51   3.3  120.43    14.29        1.93*      1.43
Study II: 5-by-3 foot Front Projection     31   4.1  120.03    12.36                    .50
Study III: 6-by-4 foot Rear Projection     30   3.6  119.57    12.86

* p < .05


viewed their respective televised presentation for equal amounts of time, those students who viewed the more realistic types of visuals may not have had sufficient time to study and comprehend adequately the additional information contained in the visual illustrations presented to them.


The results obtained in evaluating the effectiveness of oral instruction complemented by visual images of different sizes were quite interesting. The data indicated that merely increasing the size of instructional illustrations by projecting them on larger viewing areas does not automatically improve their effectiveness. In fact, for certain learning objectives the use of the larger images inhibited student achievement. The results indicated that where significant differences occurred the instruction presented via the 22-inch monitors was most effective in promoting student achievement.


The success of the instruction presented on the 22-inch monitors may be explained by the fact that the visuals presented more clearly the information needed by students to achieve specific objectives.


When these same visuals were expanded on the larg" 


y making it increasingly diffi- 
егсеіуе the intended messages: 


Another possible explanation may be suggested to account for the effectiveness of instruction presented by means of the conventional television monitors: the increased size of the visual images produced a larger viewing area which required the students to spend more time searching for the relevant visual information being discussed orally. Apparently the ability to perceive clearly the relevant instructional characteristics in visuals is prerequisite for visual learning.


SUMMARY 


A number of important generalizations can be developed from the cited studies which may be helpful in guiding the production and use of visual illustrations used for instructional purposes on television.


1. The use of visual illustrations to complement televised instruction does not automatically improve student achievement of different types of educational objectives.


2. The type of visual illustration found to be most effective in facilitating student achievement of a specific educational objective depends on the type of information needed by the student to achieve that objective.


3. Merely increasing the size of visual images used to complement television instruction will not necessarily improve student achievement.




REFERENCES


1. Dwyer, F. M., "When Visuals are not the Message," Educational Broadcasting Review, Vol. 2.

2. Sparks, J. M., "Expository Notes on the Problem of Making Multiple Comparisons in a Completely Randomized Design," The Journal of Experimental Education, 31:343-349, 1963.

3. Winer, B. J., Statistical Principles in Experimental Design, McGraw-Hill Book Company, Inc., New York, 1962.


TEAM TEACHING, STUDENT ACHIEVEMENT, AND ATTITUDES


NEAL R. GAMSKY 
Waupun, Wisconsin 


ABSTRACT 


This study examined the effects of team teaching on selected attitudes and achievement of ninth grade students in English and World History after 1 year. Seventy-four ninth grade students from a local school were randomly assigned to a team teaching treatment group while the remaining seventy-one members of the freshman class received traditional instruction in three separate classrooms. A teacher constructed test was employed to measure achievement in subject matter, and a personality inventory was used to measure changes in selected attitudes. The findings indicated that the team teaching approach did not appear to improve academic growth over traditional teaching methods, but it did have a significant impact on student attitudes toward teachers, interest in subject matter, sense of personal freedom, and self-reliance.


THIS is a report of a pilot study in a larger investigation of the impact of team teaching on the attitudes and achievement of high school students.


Several investigations have compared team teaching to student achievement with varying degrees of success (2, 3, 5, 6, 9). The results of this type of cooperative teaching upon student achievement have not yet been clearly established. Most researchers have continued to rely upon standardized tests to gauge achievement even though it is doubtful that most experimental treatments introduced are of a nature to produce differences in outcomes on these measures (7). It is believed that the instrumentation should be specially suited to the treatment under study. Also, it is questionable whether the effects of team teaching on student achievement constitute the most appropriate area for inquiry (8). More research relating team teaching to variables other than achievement is needed (1). Heathers (4), in a review of research, noted the lack of studies relating specific program aspects (teacher demonstration, team leadership, flexible grouping, etc.) to outcomes.


This study sought to explore the effect of team teaching on selected attitudes and achievement of ninth grade students in English and World History after 1 year using teacher constructed instruments specifically related to the subject matter taught. The same teachers employed the same course of study to team-taught, flexibly grouped students and to conventionally taught students. The specific hypotheses investigated were:


1. Ninth grade students in the team teaching program will perform significantly better on teacher constructed tests in English and History than students who are taught by traditional methods.


2. Ninth grade students who are team-taught will have significantly better attitudes toward self, school, course work, teachers, and class than students in regular classrooms.


METHOD AND PROCEDURES


Seventy-four ninth grade students from a local school were randomly assigned to a team teaching treatment group while the remaining seventy-one members of the freshman class received traditional instruction in three self-contained classrooms. The team teaching program met during a 2-hour period, and the team was composed of an English teacher, a History teacher, and a paraprofessional. Assistance was also received from the librarian for library research and a secretary for


TABLE 1


WITHIN AND BETWEEN TREATMENT GAIN SCORE COMPARISONS ON WORLD HISTORY AND ENGLISH ACHIEVEMENT TESTS IN TEAM-TAUGHT AND CONVENTIONAL CLASSES


                                                                           t
                                                                 Within Group     Between
                 Group    N      ΣD       ΣD²      D̄     D̄x-c     Pre-Post        Group

WORLD HISTORY
  Semester I       X      74    2,331    78,237   31.50   3.81      33.38*         2.50*
                   C      71    1,966    61,343   27.69             23.49*
  Semester II      X      73    1,730    49,130   23.69   3.01    (illegible)    (illegible)
                   C      71    1,896    62,980   26.70           (illegible)

ENGLISH
  Semester I       X      74    1,655    42,612   22.36   2.04    (illegible)    (illegible)
                   C      68    1,382    34,674   20.32             16.50*
  Semester II      X      73    2,538    96,550   34.76    .34      27.64*         1.17
                   C      68    2,341    90,183   34.42             23.72*


*P < .05

X = Experimental Group
C = Conventional Group


clerical work. The program utilized the concept of flexible modular scheduling to provide for large group instruction, small group discussions, and independent study. Two (20 minute) modules were designated for large group instruction and four modules for small group activities including two modules for discussion, one module for library, and one module for independent study. These units of time were arranged in varying patterns to allow for individual differences in ability and interest. Although both teachers shared in the planning and preparation of lesson units, one teacher usually had responsibility for the presentation during the large group instructional period. Both teachers participated in the small group activities. Supplementary assistance was provided each day during the 2-hour period by the paraprofessional, who helped in the preparation of materials, grading of papers, and independent study. The course of study and the teaching staff were the same for the experimental and control groups.
Each teacher constructed a 100-item test to measure subject matter covered in each semester in ninth grade English and Social Studies. The test, which included true-false and multiple-choice items, was designed to be completed in one hour. Frequency of correct answers was used as the score. Both experimental and control groups received pretests and posttests in English and History for first and second semesters.


A correlated t-test was used to measure change within each group. The mean difference between pretests and posttests was measured to compare gain made by the experimental group with that made by the control group. Pretest scores in English and World History were subjected to the F-test for homogeneity of variance; the variances were not significantly different.
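
The within-group gain analysis can be sketched from summary values alone. Assuming that the two summary columns of Table 1 are the sum of the pretest-posttest differences and the sum of their squares (the column headings are damaged in this copy, so this reading is an assumption), the correlated t is the mean difference divided by its standard error. The function name is hypothetical. Under that reading, the Semester I World History values for the experimental group (N = 74, sum 2,331, sum of squares 78,237) reproduce the t of 33.38 printed in the table.

```python
from math import sqrt


def correlated_t(sum_d: float, sum_d_sq: float, n: int) -> float:
    """Correlated (paired) t from the sum and sum of squares of differences."""
    mean_d = sum_d / n
    # Sum of squared deviations of the differences about their mean.
    ss_d = sum_d_sq - (sum_d ** 2) / n
    # Standard error of the mean difference.
    se_mean = sqrt(ss_d / (n * (n - 1)))
    return mean_d / se_mean


# Semester I World History, experimental group, as read from Table 1.
t_value = correlated_t(sum_d=2331, sum_d_sq=78237, n=74)
print(round(t_value, 2))  # 33.38
```

The statistic is referred to a t distribution with N - 1 degrees of freedom.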


Attitude scale items were constructed by a team of psychologists and reviewed by the teachers who agreed that the attitudes listed were important and could be affected by team teaching. The attitude scale items, drawn from several sources such as common tests of personality and school attitude inventories by a 3-member panel of educators, were tried out in a group of thirty summer school students. The final hundred items were scored on a one to five agree-disagree scale, and actual responses were subtracted from the ideal response thereby giving a quantitative measure of distance from the ideal attitude. The differences between actual and ideal scores for pretests and posttests were expressed as absolute values. Gain scores were measured by comparing pretest and posttest differences between actual and ideal scores with a correlated t-test. The assumption of homogeneity of variance was tested on pretest total scores as well as on subtest scores. On no score did the F-test yield an observed value equal to or greater than the critical value; the variances were not significantly different.
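
The scoring rule just described can be made concrete with a brief sketch. Each response on the one to five scale is compared with the ideal response for that item, and the absolute differences are summed, so that a smaller total means attitudes closer to the ideal. The function name, the ideal key, and the responses below are hypothetical and serve only to show the arithmetic.

```python
from typing import Sequence


def distance_from_ideal(actual: Sequence[int], ideal: Sequence[int]) -> int:
    """Sum of absolute differences between actual and ideal item responses."""
    if len(actual) != len(ideal):
        raise ValueError("actual and ideal must cover the same items")
    for value in list(actual) + list(ideal):
        if not 1 <= value <= 5:
            raise ValueError("responses must lie on the one to five scale")
    return sum(abs(i - a) for a, i in zip(actual, ideal))


# Hypothetical four-item subscale (not from the study).
ideal_key = [5, 1, 5, 3]
pretest = [3, 2, 5, 1]
posttest = [4, 1, 5, 2]

pre_distance = distance_from_ideal(pretest, ideal_key)    # 2 + 1 + 0 + 2 = 5
post_distance = distance_from_ideal(posttest, ideal_key)  # 1 + 0 + 0 + 1 = 2
print(pre_distance, post_distance, pre_distance - post_distance)  # 5 2 3
```

Because scores are distances, a positive pretest minus posttest difference indicates movement toward the ideal attitude, which is how the differences in Table 2 would be read under this scoring.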
Prior to analysis, pretest responses were factor analyzed by centroid extraction and varimax rotation using an orthogonal solution. Mean variances were compared. Responses ranging from strongly agree

re used to analyze 


clusters based on 
tion of factor ex- 
traction was basedon the amount of variance accounted 
for by the factors aS well as. 
of variables in each factor. 

The factors were identified with the following subscales:


1. Self-concept of ability
2. Self-discipline in study habits
3. Attitude toward school
4. Self-reliance (ability to act independently)


TABLE 2


WITHIN AND BETWEEN GROUP ATTITUDE GAIN SCORE COMPARISONS OF STUDENTS IN TEAM-TAUGHT AND CONVENTIONAL CLASSES


ë Postt Within Group Between Group 
subtest Pretest ша x e Post X Pre- Post 
x sd X sd Difference t Difference t 
1. Self-concept of 
ability х 16.30 7.747 15.14 661 1.155 1.571 6. 1960 
с 18.39 7.217 17. 44 8. 539 -953 1.265 6. 0275 . 19140 
2. Study habits - 
self-discipline х 13.48 5. 438 13.38 6. 519 . 098 ‚154 5.3696 
с 15.53 6. 294 16.89 7. 299 -1.359 1. 984* 5. 4812 1. 5598 
3. Attitude toward 
School x 10.28 4.574 10. 49 4. 988 -.211 .318 5. 5957 
с 11.66 5.289 11.94 5.594 -.281 414 5.4291 7.3600 
4. Self-reliance (abil- 
ity to act indepen- 
dently ) X 12.45 5.887 10.54 5. 472 1.915  3.337* 4,8366 
e 13.98 5.397 12.64 5. 533 1.344 2,500% 4,2992 ‚1227 
5. Socialization X 9.66 4,840 9.07 5. 851 .595 ‚185 6.3462 
с 9.391 5.383 10.19 6. 192 -.796 1, 173 5.4370 1.3577 
6. Attitudes toward 
teachers x 11.90 5.052 12.13 6. 500 -.225 . 264 7.1797 
с 12.72 6.222 15.53 7.848 -2.813 3.396% в, 6258 2.1681% 
7. Interest in sub- 
ject matter x 15.44 5, 458 15. 20 6.371 «239 . 300 6. 7241 
с 17.59 6. 438 19. 45 6. 563 71.859  2,748** 5,4127 1, 9838* 
8. Class participationx 14.41 6.815 14, 42 7. 007 -.014 .019 6. 2140 
с 16.48 7,028 15.44 8.657 1.047 1.131 7.4075 . 9045 
9. Respect for the 
opinion of others X 9.183 4.966 8.54 5.571 633 948 5. 63 
s s s . 6347 
С 9.969 6.409 9. 54 5. 526 210 277 43999 9417 
10. Sense of personal 
freedom x 14.27 6.031 12.59 6. 665 
© 15.08: gv E 8 Dos 1, En 2.164* 6,5264 3* 
; ; 7812 1.095 5,9385 2.308 
Composite X 127.4 37.26 121.5 43 
c 14L5 4417 56 4k Ed 5.859 1.343 36. 7638 M 
. . 74156 1.148 28,9588 1.745 
*P < .05   **P < .01   ***P < .07


Nc = 64 (Conventional Group)

Nx = 71 (Experimental Group)


5. Socialization
6. Attitudes toward teachers
7. Interest in subject matter
8. Class participation
9. Respect for the opinion of others
10. Sense of personal freedom
RESULTS 


The results of this study are presented in Tables 1 and 2. The findings from an analysis of the data in Table 1 can be summarized as follows:
1. 


2. There were no significant differences between groups for semester I English gains.


3. Semester II World History scores showed no significant differences within or between groups.


4. Semester II English scores showed significant gain within both groups, but none between groups.

Analysis of the data in Table 2 revealed the following major findings:


1. Measurement of within group changes showed that the traditionally taught group experienced significant improvement in attitudes on subtest 4 (Self-reliance) but significantly negative growth in attitudes on subtests 2 (Study habits), 6 (Attitudes toward teachers), and 7 (Interest in subject matter).


2. Within group analysis of the experimental group revealed significantly positive growth in attitudes on subtests 4 (Self-reliance) and 10 (Sense of personal freedom).


3. In comparing pretest and posttest mean differences between groups, the experimental group experienced significantly greater positive attitude growth in subtests 6 (Attitudes toward teachers), 7 (Interest in subject matter), and 10 (Sense of personal freedom).


CONCLUSIONS AND IMPLICATIONS 


The findings tend to support those of Zweibelson. Team teaching apparently has little impact over more traditional approaches on student achievement in high school English or Social Studies when teacher made tests are used to reflect the specific subject matter taught. Significant differences, however, were obtained with regard to student attitudes. The team-taught students displayed greater growth in their feelings of self-reliance and personal freedom, had more positive attitudes toward the teacher, and were more interested in the subject matter than the traditionally taught students.


In spite of the fact that there were positive experimental group
gains (.07) on the total attitude scale, some attitudes were
unaffected. For example, a common argument for team teaching is that
small discussion groups allow for greater class participation and
interaction among students. Yet the data indicated that the
traditionally taught groups felt they had the opportunity for greater
participation in class than team-taught students. Further
investigation of possible long range significance would appear to be
indicated with regard to the impact of organizational and
instructional patterns on student attitudes.


Generalization from this study is limited due to the small number of
students and teachers involved and the nature of the school setting.
It would seem that since there are specific student attitudes to be
gained from participation in team teaching, school officials would be
well advised to "sell" these programs on that basis rather than on the
unsubstantiated grounds that students learn more subject matter
from this method of teaching.
INTERVENTION ON ACHIEVEMENT


OF FIRST GRADE STUDENTS. 


THOMAS M. GOOLSBY, JR. 
University of Georgia 


ROBERT B. FRARY
Regional Educational Laboratory for the Carolinas and Virginia 


ABSTRACT 


Two hundred Gulfport, Mississippi, first grade students received the benefit of extensive and intensive
treatments designed to enhance educational effect. Approximately half of these students were from definitely
... of readiness for first grade activities. Comparison classes following the established first grade
curriculum were also monitored to provide a basis for comparison. Results strongly suggest that the
experimental treatments resulted in higher achievement. More ...


THE CLAIM has often been made that lack of adequate financial and
administrative support for schools results in inferior achievement of
pupils. The causal basis for such outcomes is easy to hypothesize but
difficult to investigate. The reason for this difficulty is that only
through the expenditure of large sums of money can the basic
conditions be modified within a framework compatible with a strong
research design.


There are studies too numerous to cite relating to the achievement
levels attained by students in poorly financed school systems as
compared to those in more adequately financed systems. In general,
the conclusion seems to be that the better conditions, generally
associated with better financing, result in higher achievement. This
conclusion, plausible as it may seem, is not conclusive since no study
comparing achievement in two settings can possibly account for all the
differences in variables, other than those involving finances, which
may affect achievement. Even if intelligence, income level of parents,
and other accessible variables are held constant, the multitude of
inaccessible variables may tip the scales in one direction or another.
Factors such as community attitude toward schools, dietary
differences, teacher training differences, etc., may have a strong
cumulative effect.


ifferences are their magnitudes. Enhancement of achievement due 
est for low readiness students by a wide margin, 


ment for segments of the system which have been subjected to
inferior learning conditions.


PROCEDURES 


tions, with the further requirement that half the students benefiting
come from disadvantaged backgrounds.


To accomplish the above objectives, ten experimental classes were
established in integrated and Negro schools. These classes were
limited to twenty students, and each class was provided with an aide
to handle routine tasks. The two hundred students assigned to these
classes were chosen randomly from two populations, half from those
known to be deprived and half from the remaining population. Teacher
assignment, while not random in a formal sense, was made without any
discernible source of bias. Classes, both homogeneous and
heterogeneous with respect to readiness, were established based on
scores on the Metropolitan Readiness Tests (MRT), Form A (revised in
1964) (5). This measure had been obtained for all students of the
district prior to assignment to classes.


Special experimental class materials designed to promote readiness
and enhance the curriculum were developed by the University of Georgia
Research and Development Center in Educational Stimulation for use in
these classes. It was believed by teachers and administrators of the
local system that the addition of these materials insured a
more-than-adequate curriculum. The main features of the enhanced
curriculum are repeated use of pretesting and posttesting to evaluate
achievement, provision of immediate rewards, sequencing and
structuring of learning activities, and supplementation of the
traditional curriculum. Teachers and the community displayed interest
and enthusiasm in the program.


Table 1 lists materials used in the experimental classes and those
used in ten comparison classes which were chosen from the remaining
twenty-four first grades of the district. These comparison classes
contained twenty-eight to thirty-five students and received no
special treatment during the year, except for testing which
paralleled that for the experimental classes.


TABLE 1


MATERIALS USED WITH COMPARISON AND EXPERIMENTAL GROUPS


Materials Used with Experimental Groups*


Concept of Culture, Hunt, A.; Blackwood, J.; Emmons, F.
(Publication No. 51a)

Language Arts and Verbal Learning Program: Part I, Introductory
Exercises in Oral Language, Jennings, L.; Walter, P.; Duhling, D.;
Quirk, K.

Language Arts and Verbal Learning Program: Part ..., Introductory
Exercises in Writing, Aaron; Mason, G. E.

Mathematics Program: Suggested Mathematics Activities for
5-Year-Olds, Perrodin, A. E.

Music Program: Developing Basic Concepts of Music, Williford, B.;
Simmons, G. M.

Pre-primary Science Program, Level 1, Zeitler, W. R.

Physical Education Program: Movement Exploration, Gober, B.;
Albertson, L.

Social Science Program: Getting Acquainted, Hunt, A.

The Ginn Basic Readers, Russell, D. H. and others, Ginn and Company,
Boston, 1964.


Materials Used with Comparison Groups


Imaginary Line Handwriting, Steck-Vaughn.

Sets and Numbers, Gundlach, B. H.; Welch, R. C.; Buffie, E. G.,
Laidlaw Brothers, Atlanta, Georgia, 1965.

The Allyn and Bacon Basic Readers, Sheldon, William D. and others,
Allyn and Bacon, Atlanta, Georgia, 1962.

The Ginn Basic Readers, Russell, D. H. and others, Ginn and Company,
Boston, 1964.

This Is Music, Sur, William R. and others, Allyn and Bacon, Atlanta,
Georgia, 1967.

The L. W. Singer Basic Reader, Pratt, Marjorie and others,
L. W. Singer Company, Atlanta, Georgia, 1965.

The Scott Foresman Basic Readers, Gray, William S. and others,
Scott, Foresman, and Company, Atlanta, Georgia, 1965.

Today's Basic Science.

We Live with Others, Hunnicutt, C. W.; Grambs, J. D.,
The L. W. Singer Company, Chicago, Illinois, 1963.


* All materials used with the experimental groups, except the Ginn
Basic Readers, were published by the Research and Development Center,
University of Georgia, Athens, 1968.


It was possible to select or establish comparison classes sharing
many of the characteristics of the experimental ones. However,
administrative considerations precluded matching each group with
respect to every relevant variable. For example, there were no
integrated, heterogeneously grouped classes in the comparison set
while there were three such classes in the experimental set. This
design deficiency places limitations on the interpretation of certain
outcomes of the study but by no means invalidates the conclusions
generally. While the comparison classes cannot be said to have been
randomly constituted, no effect on outcomes seems likely as a result
of selection bias.
In addition to MRT scores, every effort was made to obtain data for
every experimental and comparison subject for the following:


1. Mental Ability Measure: Otis-Lennon Mental Ability Test,
Primary II Level, Form J (7), administered December 1968.


2. Personal Data: (a) Number of Siblings; (b) Days Absent;
(c) Ethnic Group (Negro or white); (d) Father's Occupation
(professional, skilled, or unskilled).


3. Achievement Measures: (a) Metropolitan Achievement Tests,
Primary I Battery, Form A (6), administered February 1969, Form B
administered May 1969; (b) Botel Reading Inventory (2), administered
April 1969; (c) Teacher's Estimate of Reading Level (see footnote 2).


To analyze the achievement data, a multivariate analysis of
covariance was performed. Mental age was the covariate, and the
factors were as follows:


1. Experimental Condition: Two levels, experimental and comparison.


2. Readiness: Three levels, according to total scores on the MRT:
below 40 = Low Readiness; 41 to 65 = Medium Readiness; above 65 =
High Readiness.


3. Father's Occupation: Three levels, professional, skilled,
unskilled.


4. Sex: Two levels, male and female.
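

The readiness factor can be illustrated with a short sketch. It assumes whole-number MRT totals and assumes the middle band runs from 41 through 65; the function name is hypothetical and is not part of the original analysis. As stated, the cut points leave a total of exactly 40 unassigned, so the sketch flags that case rather than guessing.

```python
def readiness_level(mrt_total: int) -> str:
    """Map an MRT total score to a readiness level (illustrative only)."""
    if mrt_total < 40:
        return "Low"
    if 41 <= mrt_total <= 65:
        # Assumption: the middle band spans 41 through 65 inclusive.
        return "Medium"
    if mrt_total > 65:
        return "High"
    # The stated cut points do not say where a total of exactly 40 belongs.
    raise ValueError("a total of 40 is not covered by the stated cut points")


if __name__ == "__main__":
    for score in (25, 41, 65, 66):
        print(score, readiness_level(score))
```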


The multivariate analysis of covariance program from Multivariate
Statistical Programs (3) was used for analysis. This program is ver
... appropriate adjustments for significance levels according to the
method presented by Bock (1). The F-tests are based on Rao's
approximation of Wilks' lambda criterion.
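

As a rough check on the degrees of freedom reported in Table 2, the sketch below uses the usual textbook form of Rao's approximation. It is an assumption that the program used exactly this form. The sketch also assumes ten achievement variables and takes the error degrees of freedom (400) from the univariate tests in Table 3. The function name and the example value of lambda are hypothetical.

```python
import math


def rao_f_from_wilks(wilks_lambda, p, q, error_df):
    """Rao's F approximation for Wilks' lambda (textbook form).

    p: number of dependent variables
    q: hypothesis degrees of freedom
    error_df: error degrees of freedom
    Returns (F, df1, df2).
    """
    denom = p ** 2 + q ** 2 - 5
    # s reduces to 1 when the formula is undefined or trivial.
    s = math.sqrt((p ** 2 * q ** 2 - 4) / denom) if denom > 0 else 1.0
    m = error_df - (p - q + 1) / 2
    df1 = p * q
    df2 = m * s - df1 / 2 + 1
    root = wilks_lambda ** (1 / s)
    f_value = (1 - root) / root * df2 / df1
    return f_value, df1, df2


if __name__ == "__main__":
    # Two-level factor (q = 1) and three-level factor (q = 2).
    for q in (1, 2):
        _, df1, df2 = rao_f_from_wilks(0.9, p=10, q=q, error_df=400)
        print(q, df1, df2)
```

Under these assumptions a two-level factor yields degrees of freedom (10, 391) and a three-level factor yields (20, 782), which agrees with the entries in Table 2.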


Because of certain departures from the formal requirements for the
application of multivariate analyses of variance, the reader is asked
to make his own evaluation of certain outcomes. Nevertheless, these
deficiencies are comparatively minor. For example, there is a slight
tendency for the treatment to correlate with a covariate, mental age,
due to the design requirement that half the experimental population
come from deprived areas. This proportion is greater than the
proportion of deprived in the comparison group. Also there was no
random assignment between treatments although there was essentially
random selection for each group. In spite of these shortcomings, the
analysis presented is judged to be the most informative possible in
view of the complexity of the data.


RESULTS AND DISCUSSION


In Table 2, it is seen that the main effect, experimental condition,
accounted for differences in cell means at the .001 level of
probability. An interpretation of this result may be made from
inspection of Table 3.


It can be observed in Table 3 that for the covariate, mental age,
the experimental group mean is 6.3 months less than that for the
comparison group. In spite of this deficiency, the achievement means
for the experimental group are almost as high as those for the
comparison group for several achievement measures taken late in the
school year. Further, after adjustment for the covariate, the
experimental group's achievement means actually exceed those for the
comparison group on eight of the ten achievement variables.


Table 2 also shows that the observed interaction between father's
occupation and experimental conditions would occur by chance only
4.5 percent of the time. An inspection of the cell means shows that
such an interaction is largely the result of the failure of children
with fathers in the highest occupational category to achieve higher
scores than those whose fathers are in the second occupational
category in comparison classes after adjustment for the covariate. In
experimental classes, children whose fathers were in the highest
occupational category did achieve higher scores than those whose
fathers were in the second occupational category after adjustment
for the covariate. This condition was not evident with respect to
all the achievement variables in the study. Only in the case of the
first Metropolitan Achievement Test (MAT) Reading Subtest did the
univariate F-test have an associated probability value of less
than .05.


As shown in Table 2, the F ratio associated with the interaction
between readiness and experimental condition can hardly be attributed
to chance (p < .006). An inspection of the data in Table 4 readily
accounts for this result. Note that the mean covariate scores (mental
age) are nearly the same for the experimental and comparison groups
under high readiness. Then, in the high readiness column, it is
observed that differences between the experimental and comparison
high readiness groups are negligible in most cases, especially for
variables reflecting end-of-year achievement. Such is not the case
for the low or average readiness groups. In these two cases, the
covariate favors the comparison group. Thus, lower scores should be
expected on most variables for the experimental group as compared to
the comparison group. An inspection of the low and medium readiness
columns shows some striking departures from this expected result. In
fact, in many cases the experimental group exceeds the comparison
group. The significance of the interaction may be explained as
follows: low and medium readiness students benefited greatly from the
experimental treatment, while high readiness students achieved
approximately the same with or without the experimental program.
Moreover, it is apparent that lack of attention has a much worse
effect on less prepared students than on those who can "cope for
themselves."

Table 5 presents descriptive statistics for other variables in the
study according to experimental and comparison groups. It was judged
that the experimental group was considerably less well prepared for
school. They came from homes where the father's occupation was of
lower status, and families were large. They also had much weaker
vocabularies. It is interesting to note, however, that there was no
TABLE 2


MULTIVARIATE ANALYSIS OF COVARIANCE USING WILKS' LAMBDA CRITERION AND
RAO'S APPROXIMATION


Factor                                   df            F       p <

Main Effects:
  Experimental Condition (EC)            (10, 391)     6.44    .001
  Father's Occupation (FO)               (20, 782)     2.80    .001
  Sex (S)                                (10, 391)     2.48    .007
  Readiness (R)                          (20, 782)     5.71    .001

First Order Interactions:*
  EC x FO                                (20, 782)     1.61    .045
  EC x S                                 (10, 391)     .951    .486
  EC x R                                 (20, 782)     2.00    .006
  R x FO                                 (40, 1485)    1.13    .266
  R x S                                  (20, 782)     .614    .905
  FO x S                                 (20, 782)     1.45    .091


* No higher order interactions significant.
TABLE 3


EFFECT OF EXPERIMENTAL CONDITION ON ACHIEVEMENT


                                   Experimental N = 181   Comparison N = 264   Univariate F Tests

Variable                           Raw Mean    SD         Raw Mean    SD       F (1,400)   p <

Mental Age, months (covariate)     73.2        13.4       79.5        11.8     -           -

MAT First Administration:
  Word Knowledge                   14.3        6.95       16.4        7.10     3.39        .066
  Word Discrimination              14.5        6.77       17.2        7.39     7.83        .005
  Reading                          14.3        6.33       17.2        7.58     6.57        .011
  Arithmetic                       33.3        16.0       38.5        15.0     1.35        .246

MAT Second Administration:
  Word Knowledge                   23.1        8.04       23.4        7.93     2.76        .097
  Word Discrimination              23.0        8.28       24.1        8.18     .131        .718
  Reading                          24.4        10.5       24.8        10.8     4.36        ...
  Arithmetic                       46.5        13.9       47.1        12.5     4.21        .041

Botel Instructional Level (c)      2.02        1.06       ...         1.05     13.0        .001

Teacher's Estimate of Reading
  Placement, grade                 ...         ...        ...         ...      .865        .353


a Magnitude of adjusted means is greater for experimental group than
comparison group.
b Magnitude of adjusted means is greater for comparison group than
experimental group.
c Coded as follows: 1 = preprimary; 2 = primary; 3 = first grade;
4 = second grade, first semester; 5 = second grade, second semester.

[TABLE 4: table body not recoverable from the source. Surviving notes:
E = experimental; C = comparison; coded as follows: 1 = preprimary;
2 = primary; 3 = first grade; 4 = second grade, first semester;
5 = second grade, second semester.]
TABLE 5


MEANS AND STANDARD DEVIATIONS FOR OTHER VARIABLES BY EXPERIMENTAL CONDITION


                               Experimental N = 181    Comparison N = 264

Variable                       Raw Mean    SD          Raw Mean    SD

Father's Occupation (a)        2.39        .679        1.94        .611
Number of Siblings             4.60        2.62        3.39        1.99
MRT - Total Score              42.4        20.2        51.0        18.5
Botel Potential Level (b)      2.83        1.88        4.24        2.58
Days Absent                    10.0        10.8        10.9        10.9


a Coded as follows: 1 = professional; 2 = skilled; 3 = unskilled.
b Coded as follows: 1 = preprimary; 2 = primary; 3 = first grade;
4 = second grade, first semester; 5 = second grade, second semester;
etc.


substantial difference in number of days absent for the two groups.


It should be pointed out at this time that conclusions regarding the
superiority of the experimental treatment are important mainly in the
light of the magnitude of the changes effected. The literature is
replete with studies which show that small class size is a factor in
improving instruction. Clearly, however, class size alone can never
be expected to account for large changes in educational outcome.
Haberman and Larson (4) report that in the absence of other
directions teachers given smaller than usual classes tend to present
the same material in the same manner to their students as teachers
who have larger classes. In the words of these investigators:


    Would cutting class size change instruction? We doubt it.
    Teachers just don't differentiate refinements of instructional
    activities; their role perceptions are probably not a function
    of class size at all. If smaller classes are to make a
    difference in the classroom behavior of teachers, it may be
    that they need to be instructed on how to teach a small class
    in different ways.


The Gulfport Project was designed not simply to show the effect of
class size but to provide the teachers with a variety of material and
activities to permit them to take advantage of reduced class size.
Again, of course, the literature is full of studies which show the
superiority of one teaching method or one set of materials over
another, but experience has shown that these results can seldom be
replicated. With its combination of activities, a smaller class size,
and special materials, the Gulfport Project offers a model for
producing ... in the improvement of instruction, without the problem
of nonreplicability caused by failure to account for major variables
within the system. For example, how can class size be depended upon
to improve instruction if there is no uniform provision of
supplementary activities and material to permit the teacher to
capitalize on the condition? Similarly, how can a new method or set
of material be depended upon to influence instruction when teachers
are forced to use the material under a variety of sometimes
unsatisfactory conditions? The Gulfport Project avoids the pitfalls
posed by these two questions.


From data not shown, the following additional outcomes were noted
(see footnote 3):


1. Readiness by levels of high, medium, and low had a strong and
highly significant effect on achievement. The high readiness groups
were favored on every variable, even after adjustment for the
covariate, mental age.


2. The lower readiness students have poorer scores on Father's
Occupation, Number of Siblings, Botel Potential Level, and Days
Absent.


3. Father's occupation has a global effect on achievement. The
computer printout shows that on most variables the skilled group was
the best performer, taking the covariate, mental age, into
consideration. This result strongly implies that the schools could
obtain better results on the average for children from professional
homes and quite possibly from unskilled homes as well.


4. Readiness grouping based on the MRT appeared not to have a
beneficial effect on achievement of either high or low readiness
students.


5. Negro students in integrated classes attained far higher
achievement levels than those in all-Negro schools even after
adjustment for the covariates, mental age and father's occupation.
FOOTNOTES


1. This research was pursuant to a USOE contract,
a Title III ESEA Project Number 68-0671-0,
in the Gulfport Municipal Separate School District,
Gulfport, Mississippi, 1968-69.


2. Developed specifically for the Gulfport Project. 


3. Readers who are interested in a more detailed 
reporting of these and other peripheral re- 
sults should consult the technical report on 
the project by Thomas M. Goolsby, Jr. and 
Robert B. Frary, Enhancement of Education- 
al Effect Through Intensive and Extensive In- 
tervention - The Gulfport Project, Gulfport, 
Mississippi: The Gulfport Municipal Separate 
School District. Copies may be obtained by 
writing to Dr. Mercer Miller, Assistant Su- 
perintendent, Gulfport Municipal Separate 
School District, Gulfport, Mississippi. 
EFFECTS OF ADJUNCT QUESTIONS, PRETESTING,


ABSTRACT


Fifty-eight students undertook self-paced study of a 23,000-word text. The 2x2x2 design involved: (1) adjunct
questions versus no adjunct questions, (2) a pretest versus no pretest, and (3) supervised study versus
unsupervised study. The posttest included the pretest items and adjunct questions along with new items. A
retention test came 7 to 13 days later. Neither pretesting nor supervision had any influence on achievement.
Adjunct questions did improve performance, but only with respect to that subtest of the posttest comprised of
the adjunct questions themselves. Thus, adjunct questions failed to produce the general beneficial effects on
learning observed elsewhere by others. This appears attributable to the experimental requirement of other
studies that Ss read the instructional material only once.


THE EFFECTIVENESS of instructional materials depends not alone on
properties inherent in the materials, but also on the circumstances
surrounding their use. The amount learned from a given textbook,
film, or other instructional exposure always can be influenced
significantly by appropriate manipulations affecting either the
learner, the exposure itself, or the environment in which the two
interact. Beneficial applications of this principle are difficult,
however, because so little is known about what constitutes
appropriate manipulations. It is easy to depress learning (through
exorbitant levels of ambient noise, for instance), but to enhance
performance is another matter, particularly when the materials
already are tolerably good and the environment benign.


The present experiment was concerned with assessing the learning
consequences of possible manipulations relating to individual
self-study of expository material. We were interested, though, only
in techniques that could very practicably be put to use in typical
instructional settings such as the public schools, not with
procedures demanding large investments of time, money, or talent.
Available techniques of the former variety appear to fall in three
general categories:


1. Precondition the learner (e.g., preview the material to be
studied).


2. Modify the conditions of study (e.g., require note-taking).


3. Provide study aids (e.g., supply learning objectives).


The design, accordingly, was a 2x2x2 factorial involving one
variable from each category:


1. Pretest versus no pretest.


2. Scheduled, supervised study versus independent, unsupervised
study.


3. Adjunct questions versus no adjunct questions.


It was felt (a) that pretesting might induce sets to attend to the
material more assiduously, especially those portions directly related
to the pretest questions; (b) that the discipline deriving from
scheduled, supervised study might promote more careful reading of the
text; and (c) that adjunct questioning might lead not only to better
learning of the material covered by the questions themselves, but to
better study behavior generally. Adjunct questioning has long been
advocated by Pressey (7) as a simple, inexpensive means of augmenting
the instructional value of a text, but it has never been clear
whether it is the text material that is learned, or only the adjunct
questions. A number of recent studies (2,3,10) have demonstrated that
if adjunct questions are employed in certain special ways, they can
indeed produce general facilitative effects as well as the more
predictable specific ones.
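

The 2x2x2 factorial layout described above can be enumerated in a few lines. This is only an illustrative sketch: the dictionary keys and level labels are hypothetical names chosen here, not terms from the original study materials.

```python
from itertools import product

# Three two-level factors, one from each category of technique.
factors = {
    "pretest": ("pretest", "no pretest"),
    "study": ("supervised", "unsupervised"),
    "adjunct": ("adjunct questions", "no adjunct questions"),
}

# Every combination of one level per factor is one cell of the design.
cells = [dict(zip(factors, combo)) for combo in product(*factors.values())]

if __name__ == "__main__":
    for number, cell in enumerate(cells, start=1):
        print(number, cell)
```

Crossing the three factors gives eight cells, so each main effect compares the four cells at one level of a factor with the four cells at its other level.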


Pretesting was done as much to ensure experimental precision as to
foster learning. Knowledge of results consequently was not supplied,
nor were Ss told that the pretest was meant to serve as a learning
aid.


METHOD 
Subjects 


From the senior class of Morristown High School, New Jersey,
fifty-eight volunteers were paid to serve as Ss. About half were
female, half male. Each volunteer was paid a flat rate independent of
the time it took to complete the required work. The rate, however,
was made contingent on posttest performance. The sum paid was either
$18, $15, or $12; the average time investment by the students,
including orientation and testing, was about 7 hours.


The Text 


The text studied was a 23,000-word nontechnical introduction to
computers, prepared originally as the opening lesson of a self-study
programming course for beginners. It provided a comprehensive
overview of how one goes about using a modern computation center to
solve scientific problems. No overt responses were required.


In addition to the textual material itself, the les- 
son included the following components: 


1. A form for recording study time;


2. A detailed table of contents;


3. Fifteen technical illustrations (photographs, diagrams, examples
of computer output, etc.);


4. A list of key terms and concepts, indexed to facilitate look-up;


5. A lesson summary, about one-third the length of the main text;


6. A moderately easy check-out quiz.


Previous investigation had established that the 
lesson was well accepted by students as a self-in- 
structional instrument, and that the mean self-pac- 
ed study time was around 5 hours (4.9 hours in the 
present instance). It was known also that the lesson 
was effective in accomplishing its instructional ob- 
jectives. 


Half the Ss studied the text in exactly the form above; the other
half studied it in conjunction with a separate booklet containing
fifty-three adjunct questions, a mixture of true-false, multiple
choice, short answer, and cloze items. In the adjunct-question
treatment, the student encountered instructions after every two or
three pages of the text directing him to answer either one or two of
the questions in the booklet. The questions invariably came after the
material to which they related, an arrangement regarded as important
from the standpoint of securing general, as opposed to specific,
learning effects (2,3,4,5,10). Knowledge of results was supplied by
indexing each question to the page of the text where the answer
could be found.


Examinations 


Not counting the check-out quiz, which was not used as a criterion,
three examinations were administered:


1. A 24-item pretest;


2. A posttest consisting of the twenty-four pretest items, plus the
fifty-three adjunct questions, plus an additional forty-four new
items (the forty-four new items were administered first);


3. A retention test of thirty-six items, all different from any of
the foregoing.


The pretest and posttest were closed-book, power tests; the retention
test was an open-book, timed test. All three examinations constituted
representative samplings of the lesson content. To defeat guessing on
the retention test, the latter was composed entirely of short-answer
items. The pretest and posttest, however, each contained a variety of
item types (like the adjunct questions).


Procedure 


The research was carried out ... a 1-week spring recess; ... on a
Saturday 1 week later ... reported to the school auditorium ... study
at school ... the auditorium ...


Both groups of students next were told to open their binders, remove
"envelope A," and complete the materials found inside. On opening
envelope A, half the Ss found both the pretest and a background
questionnaire; the other half found only the "placebo" background
questionnaire.


... completing the foregoing task, the group engaging in supervised
study ... receive an appointment ... time, they were ...
instructional materials ...


The unsupervised group were being instructed meanwhile that they
could study the materials when and where they wished, and that a
study room would be open at the school each day from 9:00 A.M. to
9:00 P.M. should they elect to avail themselves of it. They, too,
were to turn in their materials and obtain a posttest appointment as
soon as they felt prepared.


Thereupon, all Ss silently read additional instructions telling half
of them simply to "go ahead and start studying," and the other half
to open "envelope B" containing the booklet of adjunct questions
together with directions for their use. The unsupervised group now
were dismissed to do as they pleased, and the experiment proper got
under way.


By the second day ( Tuesday), a few of the stu- 
dents began to finish and were given appointments to 
be posttested Wednesday, approximately 24 hours 
later. This pattern of 24-hour posttest delay then 
continued through the week, with morning and after- 
noon testing sessions each day. Despite an entire 
week of glorious weather, the great majority of Ss 
were dependable and punctual in attending supervis- 
ed study periods and keeping posttest appointments. 
In analyzing the performance of individuals posttest- 
ed at different times during the week, no evidence 
of test compromise could be detected. 


On Saturday morning, 1 week later, the students reassembled to take
the retention test and collect their pay. Again most were punctual.
Makeup sessions were arranged for absentees, the last of whom was
finally given the retention test on the following Wednesday. The time
between posttest and retention test varied from 7 to 13 days, the
modal span being 9 days.


RESULTS


Pretest Results 


Scores on the pretest (twenty-nine Ss) were ap- 
proximately at the level of blind guessing. A 2x2 
analysis of variance showed no significant interac- 
tion or main effects. 


Readministration of the pretest items as part of the posttest
yielded a test-retest correlation of -0.08 and a mean gain more than
eight times its standard error. Together these findings suggest that
substantial learning took place as a consequence of study.
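

The two statistics mentioned here, a test-retest correlation and a mean gain expressed in units of its standard error, can be computed as in the following sketch. The function names are hypothetical, and the four pairs of scores in the usage example are invented solely to exercise the code; they are not data from the study.

```python
import math
from statistics import mean, stdev


def pearson_r(x, y):
    """Pearson product-moment correlation between paired scores."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def gain_in_standard_errors(pre, post):
    """Mean pre-to-post gain divided by the standard error of that mean."""
    gains = [b - a for a, b in zip(pre, post)]
    standard_error = stdev(gains) / math.sqrt(len(gains))
    return mean(gains) / standard_error


if __name__ == "__main__":
    # Invented scores for four hypothetical students.
    pre = [5, 6, 4, 7]
    post = [15, 14, 16, 15]
    print(pearson_r(pre, post), gain_in_standard_errors(pre, post))
```

A near-zero correlation together with a large ratio is the pattern the text describes: pretest standing says little about posttest standing, yet nearly everyone gained.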


The experiment began with sixty-four Ss, six of whom dropped out.
Unfortunately from a statistical point of view, three of the dropouts
occurred in the same cell of the 8-cell design. We attribute no
significance to this (if all distributions are equally probable,
there is better than a 20 percent chance that some one of eight cells
will contain three or more of six dropouts), but it did result in a
fairly acute case of unequal N's. Five analyses of variance were
therefore conducted, one using 0's for the missing scores, another
using cell means, and three involving different random discards of
data so as to bring all cells down to N = 5.
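

The "better than a 20 percent chance" figure can be checked by brute force. The sketch below assumes that each of the six dropouts falls independently and with equal probability into any of the eight cells, which is one reading of "all distributions are equally probable"; the function name is hypothetical.

```python
from collections import Counter
from itertools import product


def prob_some_cell_has_at_least(k=3, dropouts=6, cells=8):
    """Exact probability that at least one cell receives k or more dropouts.

    Enumerates every equally likely assignment of dropouts to cells.
    """
    hits = 0
    total = 0
    for assignment in product(range(cells), repeat=dropouts):
        total += 1
        if max(Counter(assignment).values()) >= k:
            hits += 1
    return hits / total


if __name__ == "__main__":
    print(prob_some_cell_has_at_least())
```

Under that assumption, 60,544 of the 262,144 possible assignments put three or more dropouts in some cell, a probability of about 0.23, which is consistent with the figure quoted in the text.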


The three parts of the posttest (pretest items, adjunct items, and
new items) were analyzed separately. Means and standard errors for
the treatment main effects are shown in Table 1.


Scores on both the pretest items and new items 
were found to exhibit a tendency toward either triple 
interaction or double interaction between the pretest/ 
no-pretest and supervision/no-supervision factors. 
We do not feel, however, that this tendency was of 
sufficient strength to invalidate straightforward in- 
terpretation of the main effects. (For example, of 
ten F-ratios for the double interaction—one for pre- 
test items and one for new items in each of five anal- 
yses—only one reached the .05 level of significance. ) 


All main effects for the pretest items and new items 
proved nonsignificant, the F-ratios in all analyses be- 
ing uniformly ~1. For these two parts of the post- 
test, thus, the null hypothesis appeared fully satis- 
fied. Furthermore, the pretest items and new items 
behaved very similarly from a statistical standpoint 
throughout all analyses; it was as if the pretest items 
in the posttest were simply another set of new ques- 
tions just as unfamiliar to the pretested group as to 
those who had never seen them before. Additional evidence
of the similarity between the pretest items and
new items is afforded by the correlation of 0.86 obtained
between them.


The adjunct items in the posttest indicated no triple
interaction but did display a tendency toward double
interaction, the same as for pretest items and new
items. Here again, however, we adjudge the trend
too weak to interfere with normal interpretation of
main effects.


In four of the five analyses the adjunct items, not
surprisingly, generated a highly reliable main effect
in favor of the group who used the adjunct questions
while studying the text (smallest F = 10.2, 1/32 df,
P ≈ 0.006). The only analysis in which this effect
failed to show up strongly was the one that used 0's
as the scores of the six missing Ss, but even there it
was significant at about the 0.09 level (F = 3.1, 1/56
df).


Two of the analyses indicated a significant difference
on the adjunct items between the pretested and
not-pretested groups. However, one of these is suspect
because it is the analysis that used cell means
for the missing scores (F = 6.1, 1/56 df, P ≈ 0.02),
and the other, a "random discard" analysis, showed
the difference only barely reliable (F = 4.7, 1/32 df,
P ≈ 0.04). Thus, in light of its statistical frailty, and
equally because it seems to make no intuitive sense,
we believe this finding ought to be disregarded.


Retention Test Results 


The retention test (see Table 1) resulted in total 
satisfaction of the null hypothesis for all interactions 
and main effects. If the interaction tendencies ob- 
served at the time of the posttest were genuine, they 
evidently disappeared during the 1- to 2-week reten- 
tion interval. 


Test Reliability 


Kuder-Richardson Formula 20 yielded coefficients
from 0.87 to 0.93 for the three parts of the posttest.
Adequate reliability for the retention test is indicated
by its total-score correlations in the mid-to-high 108
with these various parts of the posttest.


TABLE 1

MEANS AND STANDARD ERRORS BY TREATMENT MAIN EFFECT AND PERFORMANCE MEASURE

                                     Performance Measure

                                  Posttest
                       Pretest     Adjunct     New         Retention     Study Time
                       Items       Items       Items       Test          (in hours)
Main Effect            (24 max.)   (53 max.)   (44 max.)   (36 max.)

Pretest                14.3        37.4        27.6        20.0          4.98
  (N = 29)             (±1.0)      (±2.0)      (±1.4)      (±1.3)        (±0.43)
No Pretest             15.4        42.4        28.3        22.3          4.77
  (N = 29)             (±0.9)      (±1.3)      (±1.2)      (±1.1)        (±0.23)
Supervision            14.9        40.5        28.1        20.3          4.84
  (N = 31)             (±0.8)      (±1.4)      (±1.1)      (±1.0)        (±0.20)
No Supervision         14.7        39.2        27.8        22.3          4.92
  (N = 27)             (±1.1)      (±2.1)      (±1.5)      (±1.4)        (±0.47)
Adjunct Questions      15.4        44.8        28.6        [illegible]   4.58
  (N = 28)             (±0.9)      (±1.4)      (±1.2)      (±1.2)        (±0.25)
No Adjunct Questions   14.3        35.3        27.3        20.2          5.15
  (N = 30)             (±1.0)      (±1.6)      (±1.4)      (±1.1)        (±0.40)
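
For readers unfamiliar with Kuder-Richardson Formula 20, the sketch below shows how the coefficient is computed from right/wrong item scores. The four-examinee data set is made up for illustration and is not from the study, and the use of the population variance (dividing by n) for total scores is an assumption; some texts divide by n - 1 instead.

```python
def kr20(scores):
    """Kuder-Richardson Formula 20.

    `scores` is a list of examinee rows, each a list of 0/1 item scores.
    Total-score variance is the population variance (divide by n),
    an assumption of this sketch.
    """
    n = len(scores)        # number of examinees
    k = len(scores[0])     # number of items
    # Sum over items of p * q, where p is the proportion passing the item.
    pq_sum = 0.0
    for j in range(k):
        p = sum(row[j] for row in scores) / n
        pq_sum += p * (1 - p)
    totals = [sum(row) for row in scores]
    mean_total = sum(totals) / n
    var_total = sum((t - mean_total) ** 2 for t in totals) / n
    return (k / (k - 1)) * (1 - pq_sum / var_total)


# Hypothetical data: 4 examinees, 3 items.
example = [[1, 1, 1], [1, 1, 0], [1, 0, 0], [0, 0, 0]]
print(kr20(example))
```

For this toy data the sum of p*q is 0.625, the total-score variance is 1.25, and the coefficient is 0.75.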


Study Times 


The time data produced a number of significant
interactions, particularly a triple interaction, which
are difficult to interpret. The group, for example,
that was supervised but was not pretested and used
no adjunct questions consumed nearly 4 hours more
time on the average than its "opposite," the group
that was not supervised but was pretested and did
use adjunct questions (6.7 hours as against 2.9).
Looking at the marginals, however, no substantial
time advantage accrued to either level of any main
effect (Table 1). The largest overall time difference
was that for the adjunct-question factor, where
the Ss using adjunct questions unexpectedly completed
their study some 34 minutes faster than the group
not using adjunct questions.


DISCUSSION 


Effects of Pretesting 


Whereas a pretest might be expected to enhance
motivation, induce learning of the items in anticipation
of the posttest, and/or focus the learner's attention
on specific parts of the material to be studied,
all achievement measures indicated that pretesting [...]
instance were the length and intricacy of the study task
(i.e., any potential effects may simply have extinguished
with time) and the students' initial low level
of knowledge of the subject matter (i.e., the pretest
questions may have appeared meaningless).


Effects of Supervision 


Supervised study, as opposed to independent study,
was likewise ineffectual. The only real hint of any
difference between the two treatments was that five
of the six dropouts occurred in the unsupervised group.
This is not, however, a statistically significant difference
(p > .20).
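
The source does not say which test lies behind the "p > .20" figure. One plausible check, offered here only as an assumption, is an exact two-sided binomial (sign) test of five dropouts out of six falling in one of two equally likely groups. The function name is hypothetical.

```python
from math import comb


def binom_two_sided_p(k, n):
    """Exact two-sided p-value for k of n events falling in one of two
    equally likely groups (p = 0.5).

    Sums the probabilities of all outcomes that are no more likely than
    the observed one.
    """
    observed = comb(n, k)
    tail = sum(comb(n, i) for i in range(n + 1) if comb(n, i) <= observed)
    return tail / 2 ** n


print(binom_two_sided_p(5, 6))  # five of the six dropouts in one group
```

This gives 14/64, or about 0.22, which is consistent with the nonsignificant result reported in the text.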


It is noteworthy that the study facility provided at
school for the unsupervised group was almost never
used. Precise records [...] that fewer th[...]

[...] growing contention ([...], 8) that a large proportion
of the secondary curriculum could be acquired
satisfactorily through "learning without teaching."


Effects of Adjunct Questions


The sole clear-cut finding of the study was that 
students exposed to adjunct questions achieved sig- 
nificantly better than those not exposed, but only with 
respect to that subtest of the posttest comprised of 
the adjunct questions themselves. Adjunct question- 
ing did not affect performance on any other part of 
the posttest or on the retention test. It thus appears 
that the adjunct questions, though generating strong 
question-specific effects, failed to produce the more 
general beneficial effects on learning observed in 
recent experiments by others. 


As Rothkopf (9) has pointed out, however, adjunct
questions may have either beneficial or detrimental
general effects. If both types of effect were operative
in the present case, one might reasonably expect
to find some cluster of new items in the posttest
on which the adjunct-question group displayed
superior performance, and another cluster on which
their performance was inferior. To check this out,
each of the forty-four new items of the posttest was
analyzed, with the aid of the Lawshe nomograph (6),
to determine if the proportion of Ss passing the item
in the adjunct-question group was significantly larger
or smaller than the proportion passing it in the
no-adjunct-question group. It turned out that the
two proportions differed at or beyond the .05 level
on exactly four of the forty-four items, three times
in favor of the adjunct-question group, once in favor
of the other group. Since these results obviously
are well within chance limits, this analysis further
supports the view that adjunct questioning produced
no general effects.
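
The "well within chance limits" judgment can be illustrated with a simple binomial calculation. This sketch is not the authors' procedure. It assumes the forty-four item comparisons are independent, each with a .05 chance of a spurious significant difference, which is only approximately true because the same Ss answered every item.

```python
from math import comb


def prob_at_least(k, n, p):
    """Probability of k or more successes in n independent trials with success rate p."""
    return sum(comb(n, i) * p ** i * (1 - p) ** (n - i) for i in range(k, n + 1))


n_items = 44
alpha = 0.05
print("expected number significant by chance:", n_items * alpha)
print("P(4 or more significant by chance):", round(prob_at_least(4, n_items, alpha), 3))
```

Under these assumptions about 2.2 items would be expected to reach the .05 level by chance alone, and four or more would turn up in roughly 18 percent of such analyses.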


General effects typically have been elicited by 
adjunct questions in other investigations only when 
the questions are placed—as was done in the present 
study—after the text material to which they pertain. 
In all these other experiments, moreover, the Ss 
have been allowed to read the instructional material 
only once. Thus, the learner realizes he will have 
but a single chance at the material and is either told 
or soon becomes aware that questions of unknown 
content await him from time to time. It is under- 
standable that these conditions should serve to 
heighten his concentration. 


But when he knows he can reread the text and review
the questions as often as he likes, which, after
all, is the normal process in preparing for an examination,
there is no longer a good reason for the
student to concentrate any harder on the text than he
would if not aided by adjunct questions. Hence, we
believe, the adjunct questions lose their potency except
as emphasizers of the specific information with
which they deal. By the same token, it seems a likely
surmise that the placement of the questions also
loses its importance under "real life" study conditions.
A guess is that the adjunct questions would
have had the same impact in the present experiment
no matter where they occurred in relation to the text.

CONCLUSIONS 


Pretesting done without feedback on [...]
material proved in this study a poor way to precondition
the learner to the task ahead. Had feedback
been provided, and had the students been instructed
to pay close attention to the questions even when
they did not understand them, the outcome might have
been more favorable. There is a distinct possibility,
however, that any pretest containing specific, detailed
questions is of little conceptual help to a learner.
A better approach might be a highly general preview,
or "advance organizer," of the type investigated by
Ausubel (1).


The study found no evident advantage for regular- 
ly scheduled, supervised study at school over un- 
scheduled, unsupervised study done mostly at home. 
Hawthorne effects undoubtedly played a role in sus- 
taining the performance of the students, but any such 
effects, we feel, were heavily counterbalanced by 
the magnificent weather enjoyed throughout the per- 
iod, a vacation week, in which the study was carried 
out. Our conclusion is that the rigorous attendance 
rules and lock-step scheduling so commonly invoked 
by high schools may be selling many students short. 


On the strength of the present findings, we are 
obliged to conclude that adjunct auto-instruction of- 
fers little promise except as a device to foster learn- 
ing of the adjunct questions themselves. The trans- 
fer induced by adjunct questioning was insufficient to 
improve performance even on the many pretest items, 
new items, and retention-test items that were relat- 
ed in content to various of the adjunct questions. All 
the same, the specific learning produced by the ad- 
junct questions was not detrimental to achievement 
on these other tests and came at no extra cost in 
study time. Hence, an affirmative recommendation 
for the judicious use of adjunct questions seems indicated.


FOOTNOTES 


1. More accurately, the study was an independent 
sub-experiment of a larger 2x2x3 factorial. 
The third level of the last factor is omitted 
from the discussion because it involved a 
complicated re-configuration of the text 
which is cumbersome to explain and which, 
in any event, proved ineffectual. 


2. We thank the officials of the Morristown schools 
for their cooperation and assistance, notably, 
Dr. Harry M. Wenner, Superintendent of
Schools, and Mr. William E. Kogen, Prin- 
cipal of Morristown High School. 
ABSTRACT


Because of low verbal skills exhibited by deaf subjects and because of earlier research results linking ver- 
bal behavior to aesthetic behavior, this preliminary investigation sought to lay some groundwork for an empiri- 
cal study of the relationship between these two ordering systems. This study states the postulates upon which 
our working paradigm is based and presents the results of a correlational analysis of syntactic and aesthetic 
preference scores of ninety-two deaf high school students (both profoundly deaf and legally deaf). The results 
appear to corroborate earlier findings linking syntactic and aesthetic behavior. The study suggests the direction
for an experimental study which is now in operation under a university grant.


THE PARADIGM upon which this correlational 
study was based is related to the way in which hu- 
mans appear to organize syntax, metaphors, analo- 
gies, and other inarticulate systems. Based on 
previous research by the authors, there are some 
indicators which seem to suggest a complicated interconnection
between these various ordering functions
of the mind. The process of verbalization, that
is, the search for word-phrase counterparts to experience,
appears to be related to the inarticulate
ordering systems as an assimilating activity which
seeks out the appropriate order and intent from among
the possibilities available. The study of this process
during the pre-committal stages of ordering may
provide a fruitful path into the study of the systems
themselves and their elusive interconnections. In
art, it is as though the act of manipulating preconscious
alternatives increases the flexibility of the
S, permitting him to perceive greater figural possibilities
in the more ambiguous drawings. Just what
this relationship is defies current analysis, but the
data seem to indicate that both variables, verbal activity
and figure preference, are dependent upon each
other, i.e., that the process of verbalization
does indeed influence aesthetic behavior both in the
act of art itself and in the S's preference for figural
stimuli. But whether or not the preconscious strategies
which come into play before a syntactic commitment
is made are similar to the preconscious strategies
involved in aesthetic ordering is not known.
Nor is it known whether the stimulating of such preconscious
strategies in one ordering system has a
concomitant influence on the other systems.


Our working model suggests that such a concomi- 
tance might exist and that when Ss are permitted to 
verbalize about their own strategies they literally ex- 
pand their time of pre-committal manipulation of al- 
ternatives. It is just this flexibility which may affect 
the quality of all inarticulate products. The indica- 
tors received from two previous studies suggest that 
this is apparently the sequence. In one study with
hearing Ss there was a strong indication that such intrinsic
verbal feedback has a positive influence on
art production quality (3). A second experimental
study suggested a similar relationship between
the verbal feedback condition and
figure preference, although personality
factors were involved in changes reported
in the second study (2).

This paradigm is related to two basic observations
about the nonverbal expression of experiences.


First, in the essentially less-verbal world of the
deaf rests, perhaps, some of the elusive answers to
the "how" questions in learning. For if learning is
to take place at all among deaf Ss, it must be pedagogically
correct. The subtle and unknown methods
by which hearing Ss learn can slip by the most conscientious
researcher while he incorrectly assumes
that what he is controlling in his experiments is responsible
for the observed change. But the margin
of error appears to be substantially reduced for the
study of the less-verbal learner. If classroom pedagogy
is inadequate in the hearing classroom, learning
still seems to take place. But if classroom pedagogy
is inadequate in the nonverbal classroom, learning
simply does not occur at an appreciable rate.


Second, the apparent relationship of the inarticulate
nature of deaf learning to the inarticulate ordering
function of the plastic arts suggests that once
we know the extent of this interaction, we may be in
a position to discuss the larger question of how aesthetic
ordering relates to word meaning, concepts,
syntax development, and reading achievement. The
benefits of such information appear obvious.


Many ideas and concepts appear under the rubric
of aesthetic experience, requiring perhaps a conceptual
definition for the reader before discussing the
implications of the analysis herein reported.


In all definitions of aesthetic experience, one undercurrent
seems agreeable. It apparently serves
as a form of inarticulate communication requiring at
least two of three parts: (1) the sensed or the initial
reaction of the artist to a stimulus, (2) the object or
art work which derives as a tangible expression of
the sensed or experienced, and (3) the observer who
relates this new object to his own wealth of experience
which may result in a pleasurable or an unpleasant
sensation.


The aesthetic experience differs from other communicative
acts in that it primarily deals with nonverbal
aspects, not unlike facial expression, gesticulation,
and body movements.

It might be stated that any experience enters the
mind at some irrational level which may immediately
vent itself in some physical sensation or be transformed
into something more rational. It must somehow
be related or forced into a shape which will be
compatible with the life model of the receiver. In
such a transformation, distortion is likely to occur.
It is precisely this distortion, taking on a physical
shape and substance, which evolves into a form metaphor,
or an image-concept of the experience. Should
the S continue to rationalize his experience, he becomes
logical about it, trimming it as much as possible
by eliminating any semblance of its original irrationality.
This distorted, yet compatible, form metaphor
has great potential for communication at the
nonverbal level. In other words, it is inarticulate
communication, deriving its strength from common
sources of feeling.


The salient point is that the aesthetic ordering function
appears to be related to one of the earliest and
perhaps most pregnant stages of the intellection process.
It is here that the greatest flexibility in the
interpretation of experience can occur. Flexibility
has been regularly cited as a quality or at least a
primary influence on creative behavior (6, 8). One
of the distinctions between the artist and the non-artist
isolated by empirical scales has been this characteristic (9).


The work of Anton Ehrenzweig (4) related to the
development of articulation suggests that the automatic
assembly of visual and aural impulses from direct
experience can be, and indeed in art schools is,
taught by eliminating the gestalt figural tendency
through conscious effort toward flexible behavior.
The art student is trained to reconsider, all at the inarticulate
level, the numerous possibilities for his
particular form metaphor.


After a point of committal is reached the mind appears
to enter a refining phase. The form metaphor
may take a plastic shape as it is or be refined through
various stages until it has been condensed into logic.
One method by which this refinement is accomplished
appears to be associated with verbal behavior. Once
again the empirical cues suggest that the most verbal
students, or at least those who score higher on verbal
scales of intelligence, also see more possibilities in
partially articulated drawings or drawings which are
not clearly defined or regular. In addition to this interconnection,
it is becoming increasingly clear that
planned verbal involvement in the art classroom has a
positive influence on aesthetic growth (1, 2, 3, 5).
Indeed some language re. 
word itself, (both in 
symbolic device, en: 
synthesize real exp 
bridge between the 
word as a symbolic 
to the aesthetic ord 


With this developing paradigm coupled with carefully
calculated experimental designs, it is hoped that
some patterns of behavior will emerge which can be
systematically controlled by classroom pedagogy.

This correlational study was designed simply as
an exploratory analysis which would corroborate or
discredit findings relating verbal behavior to figure
preference. The deaf population of the upper school
[...] Pennsylvania School for the Deaf provided a
population wh[...] verbal skills was [...];
any relationship detected in this group
would be a more [...]


[...] the following variables: [...]
degree of deafness, and three scores on the Stanford
Achievement Test (SAT). The P-values reported
[...] represent only those which reached a significant
level of .95 or above for the population involved.


FINDINGS OF STATISTICAL SIGNIFICANCE


The IQ factor correlated strongly with all the [...]
scales. Since all of these scales [...]
predictably reflect IQ, such a correlation result [...]


Age represented a significant variable when
TABLE 1

CORRELATION MATRIX BMD02D CORRELATION WITH TRANSGENERATION*

                                  Age     Sex      WFPT    IQ      DD    SAWM    SAPM    SAAC

Age                               --
Sex                               --      --
Welsh Figure Preference Test -
  Art Scale (WFPT)                .2409   -.2102*  --
Intelligence Quotient (IQ)        --      --       --      --
Degree of Deafness (DD)           --      --       --      --      --
SAT Word Meaning (SAWM)           --      -.2353   .3533   .3257   --    --
SAT Paragraph Meaning (SAPM)      .1955   -.2548   .3120   .3390   --    .8436   --
SAT Arithmetic Computation
  (SAAC)                          --      --       .2721   .4148   --    .6827   .7164   --

* Negative percentages reported on the sex variable refer to females.


P-value Interpretation

P (r ≥ .173) = .95
P (r ≥ .242) = .99
P (r ≥ .338) = .999
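
To relate these correlation thresholds to a familiar test statistic, a correlation r based on N cases can be converted to Student's t with N - 2 degrees of freedom. The sketch below assumes N = 92, the sample size given in the abstract. The function name is hypothetical, and the source does not state how its thresholds were derived.

```python
from math import sqrt


def t_from_r(r, n):
    """Student's t (n - 2 degrees of freedom) for a correlation r from n cases."""
    return r * sqrt(n - 2) / sqrt(1 - r * r)


# Thresholds from the table above, with N = 92 taken from the abstract.
for r in (0.173, 0.242, 0.338):
    print(f"r = {r:.3f}  ->  t(90) = {t_from_r(r, 92):.2f}")
```

Each resulting t value can then be compared against a standard t table for 90 degrees of freedom.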


correlated with the measure of figure preference and
paragraph meaning on the achievement test. It must
be remembered, however, that the population being
discussed is deaf. In normal hearing populations the
WFPT art scale is considered to be free of the age
factor. It was unusual to find a significant age factor
for this age group related to paragraph meaning
among the deaf. It had been assumed that the achievement
test in use did not reflect this factor.


The sex factor was predictable in the hearing pop- 
ulation and perhaps was corroborated in the deaf pop- 
ulation. Females did significantly better on the 
WFPT art scale and on both word meaning and para- 
graph meaning although IQ and degree of deafness 
were not significantly correlated with the sex factor. 


SIGNIFICANT CORRELATIONS WITH THE 
BARRON ART SCALE 


In the deaf population a correlation between age
and preference for the complex-asymmetrical figures
was indicated. This relationship was unexpected
as it had been assumed that the deaf population would
be similar to a hearing population in this regard.


As expected, sex was correlated with the figure
preference score. Females tend to score higher
than males.


Significance at the .001 level was indicated when
the Art Scale was correlated with the SAT word meaning
achievement score. The Paragraph Meaning
score on the achievement test was significantly correlated
with the Art score at the .01 level. These
correlations tend to support the research findings in
normal hearing populations related to verbal involvement
and level of preference for complex-asymmetrical
figures. They suggest a similar relationship
between reading skills, verbal fluency, and high scores
on the WFPT Art Scale.

N = 92 (Ages 14-18).


The figure preference variable seemed to correlate
with the other ordering functions as measured by
word meaning, paragraph meaning, and arithmetic
computation skills on the achievement test. Therefore,
those deaf students who are achievers in these
areas tend to score higher on a test of aesthetic preference.


The interpretation of these results is limited by
two factors. First, the rho's reported in the analysis
are not high, although they are significant for the
population size, and, second, the correlations from
this study do not imply cause-effect relationships but
merely show the variables to be in some way related.
Therefore, we will limit our interpretation to a more
conceptual framework rather than try to establish a
behavioristic paradigm.


As mentioned in the introduction of this article and
the subsequent reference to our working model or
paradigm, it seems clear that the assumptions made
concerning the use of intrinsic verbal feedback might
hold for non-hearing as well as hearing Ss. This, by
itself, holds a small promise for the study of specific
verbal behaviors. It is also evident that syntactic
development has either an indirect or direct relationship
with aesthetic behavior; there is a connection.


From a research standpoint, both of these observations
are important. We should now be free to
tackle the monumental problem of isolating various
aspects of verbal acuity as it relates to perceptual
or at least aesthetic sensitivity, and researchers
should be, in some measure, ready to mount an assault
on the relationships which must surely exist
between the ordering systems mentioned.


Pedagogically, one can look forward to the isola- 
tion of stimulus-response entities related to the use 
of art, theatre, and creative writing, etc., for the 
development of word-paragraph concepts and syn- 
tactic sharpening. 


An experimental study is currently being planned
by the authors to determine the effect of manipulating
verbal behavior during the art process on change
in complex figure preference and achievement scores.


Our current prediction equation suggests that the
stimulating of both articulate and inarticulate verbal
activity by involving the deaf student in his own
evaluative feedback in art will result in a positive
change on our criterion measures.
OF STUDENTS IN VARYING ENVIRONMENTS

ABSTRACT


In determining whether or not students learned more effectively in different learning environments, experimental
and control classes were conducted over a 2-year period in a basic statistics course that involved ten
faculty members and 1,172 students. Grades, common examinations, and a checklist were the basic criterion
measures used; mathematics background and overall grade point (GPA) served as control mechanisms. Major
conclusions were: students preferring some type of independent study consistently underachieved; instructor
types, which produced high achievement levels for specific types of students, were identified in behavioral terms;
it probably will take drastic manipulations of the learning environment beyond the normal constraints of the
university in order to produce effective changes in the learning pattern.


IN THE past the nature of the learning process
has been studied by concentrating on such factors as
feedback, pace, spacing, degree of immersion, degree
of participation, mode of presentation, reinforcement,
sequence, etc. Interest has centered on
the external control mechanisms of the learning environment.
Research in this manner has, however,
produced three areas of concern: first, in many cases
characteristic differences in instructors have produced
greater learning effects than were produced by
manipulating the external factors in the learning environment;
second, individual differences in learners
have rarely been considered; third, seldom has
the learner been consulted on how he would like to
learn a given subject matter.


In this study, the focus was on identifying instructor
characteristics that differentially affect learning.
Learner characteristics were also explored to see
how they interact with the different learning environments.
Learners were interviewed as to how they
might best learn. Thus an attempt was made to determine
if these preferences are important to the
learning process.


Since 1960, McKeachie (9) and Katz and Sanford
(8) have stressed the need for considering student
characteristics in determining optimum teaching methods;
much of the research since then has reflected
this thinking. Siegal (15) stated that one of the major
deficiencies in research of this kind is the failure
to consider different learning environments for different
types of learners.


Doty (3) measured five student characteristics,
including creativity, and identified a type of student
who learned optimally under each of three instructional
methods. In a study comparing programmed
learning to a conventional method, Erickson (5) determined
how measures of tolerance to ambiguity,
rigidity, and some of the Minnesota Multiphasic Personality
Inventory scales interacted. Using the 16-PF,
McKeachie (10) found ways in which personality
traits of students affected their learning under different
instructors who were differentiated by observational
techniques. Initial state of knowledge was shown
by Shuford (14) to interact with five instructional
strategies.


As of this writing, some major research projects
now underway indicate the importance of the interaction
of instruction method with student characteristics.
The University of Colorado (2) is conducting a project
which is intended to measure the effect of personality
on the usefulness of programmed learning. A project
at Purdue (13) is determining the effect of different
types of programmed instruction on various
subgroups of students differentiated by personality
and ability profiles. The University of Oregon (6)
is measuring the interaction of four instructional formats
with thirty different personality, motivational,
and attitude inventories in maximizing course
achievement.


PROCEDURE 


This project at The University of Tennessee ran
for 2 years, encompassing 1,172 students and ten instructors.
The pilot phase consisted of interviews
and open-ended questionnaires which sought to determine
how students wanted to learn. During the
experimental phase the following data were gathered:
(1) student academic background (grades and courses
taken), (2) age of students, (3) academic field
of study, (4) student preference for type of learning
environment, and (5) score on common exam and
grade in statistics course. An achievement score
was determined by subtracting a predicted grade or
exam score from the actual score. The predicted
score was calculated using a multiple regression
equation which included cumulative GPA, math GPA,
and a quality math background index (number of advanced
math courses taken) as the predictor variables.
The correlations between predicted and actual scores
ranged from .55 to .60.
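
The achievement score described above is a regression residual: actual score minus the score predicted from the three background variables. The sketch below illustrates the idea with ordinary least squares fitted through the normal equations. The data rows, function names, and fitting details are hypothetical; the source does not report how its regression equation was estimated.

```python
def fit_ols(rows, y):
    """Least-squares coefficients (intercept first) via the normal equations."""
    X = [[1.0] + list(r) for r in rows]
    p = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [
        [sum(x[i] * x[j] for x in X) for j in range(p)]
        + [sum(x[i] * yi for x, yi in zip(X, y))]
        for i in range(p)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for c in range(p):
        piv = max(range(c, p), key=lambda r: abs(A[r][c]))
        A[c], A[piv] = A[piv], A[c]
        for r in range(p):
            if r != c:
                f = A[r][c] / A[c][c]
                A[r] = [a - f * b for a, b in zip(A[r], A[c])]
    return [A[i][p] / A[i][i] for i in range(p)]


def achievement_scores(rows, y):
    """Actual score minus the score predicted from the background variables."""
    beta = fit_ols(rows, y)
    predicted = [beta[0] + sum(b * v for b, v in zip(beta[1:], r)) for r in rows]
    return [actual - pred for actual, pred in zip(y, predicted)]


# Hypothetical students: (cumulative GPA, math GPA, number of advanced math courses).
students = [(3.0, 2.5, 1), (2.0, 3.0, 0), (3.5, 3.5, 2),
            (2.5, 2.0, 1), (3.8, 3.0, 3), (2.2, 2.8, 0)]
exam = [72, 60, 95, 70, 110, 65]  # hypothetical exam scores
for s, a in zip(students, achievement_scores(students, exam)):
    print(s, round(a, 2))
```

A positive residual marks a student who scored above what the background variables predict, and a negative one marks a student who scored below.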


It was thought that grade or exam achievement
was not a sufficient criterion measure and that some
type of student rating of learning effectiveness was
needed. Anderson (1) concluded that student rating
and achievement each measured something different.
Jones (7) also agrees that student ratings and achievement
are independent measures. Similarly, Remmers
(11) found no relationship between opinion of
the teacher and rank in class.


However, even given that a student's rating of his
teacher is an independent criterion measure, there
are many types of rating scales that could be used.
Most rating scales use some type of overall response
to the teacher. Whitlock (17) argues for the use of
performance specimens as the basis of teacher performance
evaluation. He feels that performance specimens
have the advantage of being directly observable,
having built-in relevance, being related by a power
function to the response criteria, and identifying specific
areas for remedial action. Douglass (4) used
such a performance specimen checklist and found that
there was a positive relationship between measures
of teacher effectiveness which resulted from the checklist
and student achievement. However, the relationship
was not so strong as to deny checklist measures
as being independent of achievement.

During the final quarter of this project, the Teacher
Performance Checklist developed by Douglass (4) was used.
This checklist contained seventy-seven behavioral
performance specimens. The student was asked to mark only
items which he personally observed in the course.
Furthermore, the student was asked to assign a positive
number if he thought the behavior contributed to the
learning process, a negative number if it detracted, and
a zero if no affective feeling was produced. For each
person these weights were added together to produce a
summation score. The mean stimuli intensity score was
derived by dividing the summation score by the number of
items marked. At the end of the checklist, the student was
required to make a global rating of the overall
effectiveness of the teacher.
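The two checklist scores can be sketched as follows. This is an illustrative calculation only: the item labels, the weight values, and the function name are hypothetical, and the range of weights the original checklist allowed is not stated in the text.

```python
def checklist_scores(weights):
    """Compute (summation score, mean stimuli intensity) for one student.

    `weights` maps each item the student marked as observed to the signed
    weight he assigned: positive if the behavior helped learning, negative
    if it detracted, zero if it produced no feeling either way.
    """
    marked = len(weights)
    summation = sum(weights.values())
    mean_intensity = summation / marked if marked else 0.0
    return summation, mean_intensity


# Hypothetical responses from one student who marked four items
responses = {
    "adjusted his pace to the needs of the class": 3,
    "lectured in a monotone": -2,
    "introduced humor to stimulate class interest": 4,
    "utilized audio or visual aids": 0,
}
summation, mean_intensity = checklist_scores(responses)
```

Items marked with a weight of zero still count in the divisor, since the mean is taken over the number of items marked rather than the number of nonzero weights.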


Four different experimental sections were used in 
the basic statistics course at different points in the 
project. Two of these sections were oriented toward 
the theoretical. The emphasis in these sections was 
on derivations and understanding concepts rather than 
on problem solving. Another experimental section 
promoted more independent study by meeting only 1 
hour per week instead of 3. Students were given ex- 
tensive course outlines so they could do much of the 
work on their own. The fourth experimental section 
was designed specifically for personnel management 
majors. Many of the illustrations and problems used
in this section were taken from the personnel
management field.


The three major objectives of the project were:
(1) determine if there was an interaction between
student characteristics and learning environment with
regard to learning efficiency, (2) describe different
learning environments in behavioral terms by use of
the Teacher Performance Checklist, and (3) analyze
the relationship between the various criterion
measures used.


RESULTS 


Several student characteristics that helped describe 
how students learned under different environments 
were identified. One was à studeni 
a learning environment. 
all the students en 
preferred some {уре 


years (see Table 1). These 
ferent instructors and set 
ns. Since this would һарре! 
a hundred times, it migh 


hecklist than did younger students (see 
TABLE 1

STUDENT PREFERENCE VERSUS INSTRUCTOR IN REGULAR CLASSROOM SITUATIONS:
ACHIEVEMENT DIFFERENCES

                                                    Students Preferring   Students Preferring
                                                    a Regular Class       Partial Independent
                                                                          Study
Instructor  Date           Criterion Measure          N    Mean             N    Mean
A           Winter 1969    Common Exam Achievement    51   «2.20            32   $ 248
A           Winter 1968"   Common Exam Achievement    13   + .29            20   +9.62
                      b
A           Winter 1968    Common Exam Achievement    31   +5.57            14   +1.03
B           Winter 1969    Common Exam Achievement    41   -2.16            49   -3."4
                      b
B           Spring 1968    Grade Achievement          29   + .200           10   + .110
B           Spring 1968°   Grade Achievement           5   - .080           21   - .276
C           Winter 1969"   Common Exam Achievement    20   +3.28            11   -2.55
C           Winter 1969”   Common Exam Achievement    16   +4.25            13   + .5T
C           Fall 1967      Grade Achievement          46   - .011           25   + .128
D           Winter 1969    Common Exam Achievement    29   -1.25            15   -4.46
E           Winter 1969    Common Exam Achievement    17   +2.90            39   + .67
F           Fall 1967      Grade Achievement          21   + .052           18   + .006
G           Fall 1967      Grade Achievement          19   + .247            9   - .556
H           Fall 1967      Grade Achievement          19   + .142           15   - .113

a Theoretical oriented section.  b Control section.  c Partial independent study section.
TABLE 2

STUDENT AGE GROUPS VERSUS INSTRUCTOR GROUPS: COMMON EXAMINATION
ACHIEVEMENT COMPARISONS, WINTER 1969

                              Age
Instructor       Under 21           21 and Older
Groups           N      Mean        N      Mean
A, B, and C      171    -1.0        66     +2.0
D and E          08     +1.0        68     =.

TABLE 3

STUDENT AGE GROUPS VERSUS INSTRUCTOR GROUPS: OVERALL RATINGS ON
PERFORMANCE CHECKLIST, WINTER 1969

Instructor Groups                               Age
and Criterion                        Under 21          21 and Older
Measure                              N     Mean        N     Mean
A, B, and C
  Summation of Stimuli Intensity     115   +13.9       44    +34.3
  Mean Stimuli Intensity             115   +.98        44    +2.05
  Global Rating                      134   2.39        51    2.58
D and E
  Summation of Stimuli Intensity     58    +50.05      38    +58.2
  Mean Stimuli Intensity             58    +2.73       38    +2.45
  Global Rating                      69    3.12        42    3.18

In order to further explore the differences between
these instructor groups, the individual specimens on
the performance checklist were examined. The specimens
which discriminate are listed in Table 4. Cluster 1
describes instructors A, B, and C as being more likely
to apply pressure by forcing students to explain
themselves and arrive at their own conclusions,
lecturing above their heads, and giving unusually
challenging tests which cover material over a long
period of time. This same instructor group
was described by cluster 2 as more apt to en-
TABLE 4

FACULTY GROUPS DISCRIMINATING ACHIEVEMENT OF DIFFERENT AGE STUDENTS:
COMPARISON OF STIMULI SPECIMENS

Columns: Faculty Groups A, B, and C (N = 165), Percent Observed and
Mean Intensity; Faculty Groups D and E, Percent Observed and Mean
Intensity; Level of Significance of Difference Between Faculty
Groups, Percent Observed and Mean Intensity.

Cluster 1 - More Pressure

  Forced students to qualify, explain, or justify statements and
  assertions made in class

  Required students to arrive at their own conclusions on class
  discussion or problems

  Lectured above students' level of comprehension repeatedly

  Gave unusually challenging tests which necessitated extensive
  preparation and thus resulted in definite learning

  Administered tests infrequently, forcing students to cover too
  much material for a single test

Cluster 2 - Active Leadership

  Requested and obtained students' questions and reactions

  Made a dramatic gesture to emphasize a point

Cluster 3 - Extra Help

  Adjusted his pace to the needs of the class

  Extended his office hours in order to further assist students

  Prepared the student for difficulties that might be encountered
  on an assignment

Cluster 4 - Laissez-Faire

  Permitted classroom distractions to go unchecked

  Demonstrated tolerance toward students' ideas even when they
  conflicted with lectures or course materials


32 


28 


36 


28 


47 


22 


66 


29 


12 


24 


-2.2 


+2.1 


-3.1 


*2.2 


*2.5 


10 *.8 .001 N.S. 


2 *2.5 .001 .02 | 


5 0 . 001 .05 | 


16 +2.9 . 001 N.S. 


25 81.5 N.S. 1001 Ё 


29 +3.6 .01 .01 


85 4.1 001 ‚001 


41 +3.6 ‚05 ‚001 


19 +3.9 N.S. ‚001 


2 -1.2 .001 <01 


28 


gage in active leadership by requestin 
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+4.0 N.S. ‚001 
and to tolerate diverse opinions. From these specimen
descriptions it might be inferred that older students
learn more effectively in an environment which
includes pressure and active leadership, but they are
not particularly aided by an environment which
provides extra help and a laissez-faire atmosphere.


The relationship between rating and achievement
provided another interesting avenue for research. In
general the correlation of the various performance
ratings with common exam achievement was about .30.
More important, instructors whose students achieved
highest on a common exam were also rated the highest
by these students. This is shown in Table 5 and was
also demonstrated by Douglass (4).


TABLE 5

COMPARISON OF INSTRUCTORS BY ACHIEVEMENT AND RATINGS, WINTER 1969

                                    Mean
              Common Exam   Summation of        Mean Stimuli   Global
Instructor    Achievement   Stimuli Intensity   Intensity      Rating
E             +1.9          +64                 +3.1           3.4
C             +1.9          +54                 +2.9           3.3
A             +1.5          +41                 +2.2           3.0
B             -2.1          -25                 «od            ded
D             -2.9          +33                 +1.6           2.7

Correlation with
common exam
achievement                 „Лї                 .76            14


Table 6 describes how the highest rated instructor 
differed from the lowest rated instructor on specific 
specimen items. These same items also differen- 
tiated the top group of faculty nominated for an out- 
standing teaching award from another random group 
of faculty at The University of Tennessee (16). 


TABLE 6

INSTRUCTOR E VERSUS INSTRUCTOR B: DIFFERENCES ON PERFORMANCE
CHECKLIST ITEMS (PERCENTAGE OF STUDENTS UNDER AN INSTRUCTOR
OBSERVING EACH SPECIMEN)

                                               Instructor   Instructor
Item                                           E            B
                                               N = 65       N = 61
Pointed out relationships between his
  fields and other fields of study             15           51
Lectured in a monotone                         25           52
Introduced humor to stimulate class
  interest                                     97           15
Demonstrated the importance and
  significance of his subject matter           65           43
Clearly stated the purpose and objectives
  of the course                                80           46
Adjusted his pace to the needs of the
  class                                        83           62
Summarized material and showed
  relationships in a manner which aided
  retention                                    71           23
Lectured in a manner which failed to hold
  class attention                              6            48
Refused to explain the basis for his
  grading system                               3            23
Utilized audio or visual aids including
  blackboard illustrations to clarify
  lesson materials                             82           51
Course assignments remained vague and
  disorganized                                 2            26
Lectured in a rambling, disorganized
  fashion                                                   43

After 2 years of conducting this project, several
factors stood out with regard to the overall design
of the project. There were many confounding variables
involved in the effort to describe the learning
process. For this reason, when a new learning
environment is developed or identified, a researcher
needs to know if its effects can be generalized over
a number of instructors and over time. In order to
achieve this generalization, the basic unit of analysis
must consist of the class section, not the individual
student. This requires a large number of sections
spanning a substantial period of time and including
a variety of different instructors. Also required is
the use of the same criterion measures over the entire
project, and this can be difficult when common
examinations are used. The final factor that should
be considered is the size of the manipulated change
in the learning environment. This author feels that
in order to produce a meaningful impact on learning at
the college level, drastic changes that go beyond the
normal institutional constraints are required.
Rosenbloom (12) concurs, recommending that this type of
learning environment research be done in research
centers which are autonomous from colleges and
universities.


FOOTNOTE 


1. This is part of a larger study fully reported in a
doctoral dissertation entitled "Optimal Learning
Environments for Different Types of Students,"
dated August 1969.
ABSTRACT


This study is based on the antecedent characteristics of 790 students enrolled in a professional degree
program. It investigates two approaches toward improving the performance of classification techniques. The first
approach focuses attention on the selection of predictors using scaled distances and intra-population correlations
instead of t-ratios. The second involves the development of a Bayesian Taxonomic Procedure in addition to the
traditional technique, the Linear Discriminant Function. It was found that reliance on t-ratios alone will not
guarantee the selection of the best predictor set. As for the relative performance of the classification techniques,
no significant difference was observed. However, certain hypotheses concerning the strength and weakness of
each technique emerged from the study.


THE LAST four decades have witnessed a pro- 
liferation of studies designed to predict academic suc- 
cess defined in terms of course grades. The prob- 
lem was attacked from every conceivable angle; yet 
predictive validity soon reached an unsatisfactory 
plateau (10). A possible explanation for this lack of 
progress may be sought in the nature of the criterion 
which is almost always defined in terms of grades. 
A criterion so defined is subject to considerable in- 
stability, a part of which stems from the vagaries 
of grading practices (6). Exclusive emphasis on 
grades is also open to question from the standpoint 
of construct validity. Current research indicates that 
the intellectual factor is but one factor of academic 
success, and that there are other factors orthogonal 
to intelligence (4). Finally, if we view the educa- 
tional process within a broader context, as we must, 
the question arises as to how far grades correlate 
with later occupational success. Investigations in 
this area indicate no significant correlation (13). It 
is felt that the broader criterion of success versus 
failure is less open to the criticisms raised above. 
Of course, this will not eliminate the vagaries of 
grading practices. But its impact on the criterion 
is likely to be less decisive. 


For the prediction of group membership the Linear
Discriminant Function (LDF) is generally regarded as
the most appropriate statistical technique (14). There
have been a number of applications of this technique
to taxonomic problems in education. Although these
studies represent an advance in the sense that they
have caused some shift in emphasis away from grade
point average (GPA) as the be-all and end-all of the
educational process, the results have not always been
entirely satisfactory. Even the reported performance
of the LDF may be an overestimation of what it can
really accomplish (5). It will be instructive to
explore ways in which better prediction of group
membership could be achieved. The present study
investigates two approaches to this problem.


The first approach focuses attention on the selection
of predictor variables for the LDF. Unlike regression
analysis, no systematic procedures for the selection
of variables have been developed for the LDF. True,
some guidelines are available in the works of Rao (12),
Lubischew (9), and others. But the contributions of
these writers have not been woven into an integrated
pattern within the framework of a general theory of
LDF. Consequently, a researcher, confronted with
numerous variables to choose from, has to "play it by
ear." The usual practice is to select variables by a
t-test at some level of significance. That this may
not be the most efficient procedure has been
demonstrated recently by Cochran (2).


The second approach involves the application of a
classification technique other than the LDF. Many
variables in educational and psychological research
are qualitative. Despite the fact that these situations
represent departures from the classical theory of
LDF, one often finds that a linear function of a set
of qualitative variables is used for classification
purposes. Even when the variables are continuous, the
assumption that the k populations have a multivariate
normal distribution with equal dispersion on p
measurements is rarely met. A classification technique
which does not involve the rigorous assumptions of
the LDF has, therefore, much to commend it.


PROCEDURE 


Sample, Definition of Terms, and Variables 


The sample for this study (11) consisted of students
enrolled in the Doctor of Dental Surgery program at
the University of Detroit School of Dentistry from
1953-1963. The sample size was 790, of which a
sub-sample of one hundred students, drawn at random,
was set aside for cross-validation. The criterion is
defined as follows: Success Group = Those who graduated
on schedule; Failure Group = Those who withdrew
voluntarily or involuntarily, or failed to graduate
on schedule.


In all, thirty-five predictor variables, including
biographic, scholastic, and aptitude (Dental Aptitude
Test) variables, were analyzed.


Selection of Predictors


Cochran (2) has shown that if the population could
be assumed to be multivariate normal, the taxonomic
power of a variable, say $X_i$, is given by the scaled
distance, $d_i$, where

$$d_i = \frac{\bar{X}_{iS} - \bar{X}_{iF}}{\sigma_{ip}} \qquad (1)$$

where $\bar{X}_{iS}$ and $\bar{X}_{iF}$ are the means of variable $X_i$ in
populations S and F respectively, and $\sigma_{ip}$ is the pooled
within-group standard deviation. If the samples are
not too small, $d_i$ would provide an estimate of
misclassification when variable $X_i$ alone is used for
classification. This is equal to the probability that
a random normal deviate will exceed $d_i/2$, and so
can be obtained from a table of normal distribution.
For example, if $d_i = 2.56$, $z = 2.56/2 = 1.28$, and the
probability of misclassification is 10 percent.

When a second variable is added to a variable already
selected, the combined discriminatory power of the
variables depends upon (i) the discriminatory power
$d_i$ of the individual variables, and (ii) the
intra-population correlation, $\rho_{ij}$.² More specifically,
the combined squared distance, $d_c^2$, of variables $X_i$ and
$X_j$ ($d_i > d_j$) will be determined as follows:

CASE                                   FORMULA

A. $d_j$ significant,
   $\rho_{ij}$ significant             $d_c^2 = d_i^2 + \frac{(d_j - \rho_{ij} d_i)^2}{1 - \rho_{ij}^2}$   (2)

B. $d_j$ significant,
   $\rho_{ij}$ not significant         $d_c^2 = d_i^2 + d_j^2$   (3)

C. $d_j$ not significant,
   $\rho_{ij}$ significant             $d_c^2 = \frac{d_i^2}{1 - \rho_{ij}^2}$   (4)

The implications of these formulas are worth noting:
A negative $\rho_{ij}$ always increases the combined
discrimination, while a positive $\rho_{ij}$ is, in most cases,
harmful (Case A). If $\rho_{ij} = 0$, the second variable
contributes to the extent of its discriminatory power
(Case B). A covariate or a "suppressor variable"
would improve discrimination if (a) its correlation
with the variable already selected (positive or
negative) is high and (b) the latter itself has sizable
discriminatory power (Case C).
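The worked example and the three cases can be checked numerically. The sketch below assumes the standard two-variable result for correlated normal variables with equal dispersion, d_c^2 = d_i^2 + (d_j - rho * d_i)^2 / (1 - rho^2), of which Cases B and C are the special cases rho = 0 and d_j = 0. The function names are hypothetical.

```python
import math


def misclassification_probability(d):
    """Probability that a standard normal deviate exceeds d / 2."""
    z = d / 2.0
    return 0.5 * math.erfc(z / math.sqrt(2.0))


def combined_squared_distance(d_i, d_j, rho):
    """Combined squared distance of two variables (assumed general form)."""
    return d_i ** 2 + (d_j - rho * d_i) ** 2 / (1.0 - rho ** 2)


# Worked example from the text: d = 2.56 gives z = 1.28
p_error = misclassification_probability(2.56)      # about 0.10

# Case B: uncorrelated variables simply add their squared distances
case_b = combined_squared_distance(1.0, 0.5, 0.0)   # 1.0 + 0.25

# Case C: a covariate with no distance of its own still helps
case_c = combined_squared_distance(1.0, 0.0, 0.6)   # 1.0 / (1 - 0.36)

# Case A: a negative correlation helps, a positive one can hurt
negative = combined_squared_distance(1.0, 0.5, -0.5)
positive = combined_squared_distance(1.0, 0.5, 0.5)
```

With d_i = 1.0 and d_j = 0.5, a correlation of +0.5 leaves the combined squared distance at 1.0, so the second variable adds nothing, while a correlation of -0.5 raises it above 2.3.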


When more than two variables are involved, the
formula for computing the combined distance gets quite
complex. It is clear, however, that if a variable does
not have a significant $d_i$ value or a significant
$\rho_{ij}$ with at least one of the discriminants, it could
not, in any way, contribute toward discrimination. Such
variables were eliminated, and the deletion procedure
suggested by Rao (12:249-255) was applied to the
remaining variables.


The Linear Discriminant Function


When the dependent variable is a dichotomy, like
success versus failure, what is needed is a technique
which predicts group membership by focusing attention
on the differences between groups rather than
differences within a single group. Fisher's LDF does
just that. If we have p measurements, $X_1, X_2, \ldots, X_p$,
on two groups of sample sizes $N_1$ and $N_2$, the
LDF provides maximum separation of these two groups
by maximizing the ratio of the difference between
specific means, $\bar{X}_1$ and $\bar{X}_2$, to variations within
groups (5).


his solution of the i i ; 
problem to linear equations, t 

subsequently Welch (15) апа Hodges (8) showe Ms 1 

M observations arise from normal populati? 


: fF 
It is conventional to test the effectiveness of the LDF
by the generalized distance function, $D^2$, using

$$F = \frac{N_1 N_2 (N_1 + N_2 - p - 1)\, D^2}{p\,(N_1 + N_2)(N_1 + N_2 - 2)} \qquad (6)$$

with p and $N_1 + N_2 - p - 1$ degrees of freedom. If the
F ratio is significant at a suitable level, a discriminant
equation of the following form is constructed:

$$y = \lambda_1 X_1 + \lambda_2 X_2 + \cdots + \lambda_p X_p \qquad (7)$$
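The F ratio used to test the generalized distance can be computed directly. The sketch below assumes equation (6) has the standard form F = N1 N2 (N1 + N2 - p - 1) D^2 / [p (N1 + N2)(N1 + N2 - 2)]. The function name and the input values are hypothetical.

```python
def f_ratio(n1, n2, p, d_squared):
    """F statistic for the generalized distance D^2 between two groups.

    Returns the statistic together with its numerator and denominator
    degrees of freedom, p and n1 + n2 - p - 1.
    """
    numerator = n1 * n2 * (n1 + n2 - p - 1) * d_squared
    denominator = p * (n1 + n2) * (n1 + n2 - 2)
    return numerator / denominator, p, n1 + n2 - p - 1


# Hypothetical example: two groups of 10, two predictors, D^2 = 1.0
f_value, df1, df2 = f_ratio(10, 10, 2, 1.0)
```

The resulting value would then be compared with the tabled F distribution for the returned degrees of freedom.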


An individual's discriminant score, y, is computed by
entering the measurements made on him into equation (7),
and he is assigned to Group 1 or Group 2 according
to whether his discriminant score falls above
or below a critical value established as follows:

$$y_c = \frac{\bar{y}_1^2 - \bar{y}_2^2 + \left(2\sigma_y^2 \log_e P_2/P_1\right)}{2(\bar{y}_1 - \bar{y}_2)} \qquad (8)$$

where $y_c$ is the critical value, $\bar{y}_1$ is the mean of
the larger group, $\bar{y}_2$ is the mean of the smaller group,
$\sigma_y^2$ is the overall variance of y, $P_1$ is $N_1/(N_1+N_2)$
and $P_2$ is $N_2/(N_1+N_2)$. When $P_1 = P_2$, the terms
within simple brackets vanish so that equation (8)
reduces to:

$$y_c = \frac{\bar{y}_1 + \bar{y}_2}{2} \qquad (9)$$

which will be readily recognized as the middle value
of the means (in discriminant scores) of Group 1
and Group 2.
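The effect of the prior proportions on the critical value can be sketched as follows. The code assumes equation (8) has the form y_c = [ybar1^2 - ybar2^2 + 2 var_y ln(P2/P1)] / [2 (ybar1 - ybar2)], which reduces to the midpoint of the two means when P1 = P2. The function name and the numerical inputs are hypothetical.

```python
import math


def critical_value(mean1, mean2, var_y, p1, p2):
    """Cutoff on the discriminant score between Group 1 and Group 2."""
    adjustment = 2.0 * var_y * math.log(p2 / p1)  # zero when p1 == p2
    return (mean1 ** 2 - mean2 ** 2 + adjustment) / (2.0 * (mean1 - mean2))


# Equal priors: the cutoff is the midpoint of the two group means
equal = critical_value(5.0, 3.0, 1.0, 0.5, 0.5)       # 4.0

# Lopsided priors favoring Group 1 pull the cutoff toward Group 2's mean,
# so more individuals are assigned to Group 1
lopsided = critical_value(5.0, 3.0, 1.0, 0.8, 0.2)
```

This is the mechanism by which a high base rate for one group leads a cost-neutral rule to assign nearly everyone to that group.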


The Bayesian Taxonomic Procedure


Our point of departure is the concept of prior
probabilities for the Success Group ($G_1$) and the
Failure Group ($G_2$). This is simply the "base rate."
The conditional probabilities are used in conjunction
with the prior probabilities to derive the posterior
probabilities of belonging to $G_1$ and $G_2$ by
application of the Bayes Theorem:

$$P(G_1/Z_i) = \frac{P(Z_i/G_1) \cdot P(G_1)}{P(Z_i/G_1) \cdot P(G_1) + P(Z_i/G_2) \cdot P(G_2)} \qquad (10)$$

$$P(G_2/Z_i) = \frac{P(Z_i/G_2) \cdot P(G_2)}{P(Z_i/G_1) \cdot P(G_1) + P(Z_i/G_2) \cdot P(G_2)} \qquad (11)$$

where $P(G_j/Z_i)$ is the posterior probability of
belonging to $G_j$, given that an individual has the
attribute pattern vector $Z_i$. Here $Z_i = (z_{1j}, z_{2j}, \ldots, z_{pj})$,
where the first subscript identifies the attribute, and
j = 1 denotes the presence and j = 0 denotes the absence
of the attribute.³ $P(Z_i/G_j)$ is the conditional
probability and $P(G_j)$ is the prior probability.
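Formulas (10) and (11) amount to a two-group application of Bayes' theorem, sketched below. The function name, the conditional probabilities, and the priors in the example are hypothetical illustrations, not figures from the study.

```python
def posteriors(cond_g1, cond_g2, prior_g1, prior_g2):
    """Posterior probabilities of G1 and G2 for one attribute vector.

    cond_g1 and cond_g2 are the conditional probabilities of observing the
    attribute vector within each group; the priors are the base rates.
    """
    evidence = cond_g1 * prior_g1 + cond_g2 * prior_g2
    return cond_g1 * prior_g1 / evidence, cond_g2 * prior_g2 / evidence


# Hypothetical attribute vector seen in 10% of successes and 30% of failures
with_base_rates = posteriors(0.10, 0.30, 0.80, 0.20)
with_equal_priors = posteriors(0.10, 0.30, 0.50, 0.50)
```

With priors of 0.80 and 0.20 the vector is still assigned to the success group (posterior about 0.57), whereas with equal priors the same vector is assigned to the failure group (posterior 0.25 for success).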


The derivation of all measures required for formulas
(10) and (11) is illustrated in the tree diagram
(Figure 1) where hypothetical figures are used.


FINDINGS


Selection of Predictors


Of the thirty-five variables included in the analysis,
nine were eliminated because they lacked discriminatory
power as well as intra-population correlation with any
of the discriminants. The predictors finally selected
for the classification models are listed in Table 1.


It was found that the first two variables (X, and
X5, ) yielded a larger combined distance than all
other pairs of variables; and the first three variables
(X, , X;,, and X,, ) yielded a combined distance
larger than all other combinations of three variables.
In fact, the combined distance of these three variables
was found to be 1.081, while the combined distance
of the triad with the largest t-ratios (Xes Xj, e 1
X,,) was only 0.467. This would seem to indicate
that exclusive reliance on t-ratios does not
guarantee that the most productive variables will be
selected because intra-population correlation is
ignored. Another drawback of a selection procedure
based on t-ratios alone is the danger of omitting
covariates. In the present study, although none of
the covariates qualified for selection, at least one
came close to being selected.


The Effectiveness of Models


A model which classifies subjects as belonging to
the Success Group and the Failure Group produces
two types of errors or misclassifications. Some of
those classified as successes may in fact be failures,
the "false positives." Some of those classified as
failures may in fact be successes, the "false
negatives." A classification strategy may be so devised
as to minimize "false positives," "false negatives,"
or total misclassifications. The research problem
usually dictates which of these should be minimized.
The present research is concerned with the selection
of a fixed number of candidates from a large applicant
pool. Since selection may be regarded as a special
case of classification where the cost of "false
negatives" is zero or negligible, it becomes apparent
that minimization of "false positives" has to be
the primary aim.


TABLE 1 { 
| 
] 


THE PREDICTORS SELECTED 
FOR CLASSIFICATION MODELS 


j0 
„pat 
Predictor Scaled З 
Distance #1 500 
X,: Location of Residence 0. 329 13 | 
4 
X24: Manual Average 0. 327 i o 
‚50 
X34: Spacial Relations 0.329 à ; 
Хы Chemistry Honor Point 3. и 
Average 0.326 1 
X,: Number of Colleges 3.6 
Attended 0. 342 


Having formulated the classification strategy, the
performance of the models must be evaluated against
the a priori strategy, that is, the prevailing strategy
of the current selection procedure. The best estimate
of the efficiency of the a priori strategy is given by
the "base rates," that is, the proportion of successes
and failures that occurred when this strategy was in
force. For the University of Detroit Dental School,
the base rates for the success and failure groups are
0.7942 and 0.2058 respectively. A classification model
is adjudged superior to the a priori strategy if it
significantly changes the base rates in favor of the
success group.


For the present research, two classification
strategies were formulated. Strategy 1 assigned
individuals of the cross-validation group in accordance
with a decision rule which imputes equal cost to both
types of misclassifications.⁴ This was accomplished by
applying formula 8 to determine the critical value for
the LDF, and by setting $P(G_1) = 0.7942$ and
$P(G_2) = 0.2058$ for the Bayesian Taxonomic Procedure
(BTP). The resulting decision rules are listed in
Table 2.


TABLE 2


DECISION RULE FOR STRATEGY 1

The Linear Discriminant Function

1. If $y \geq 2.538$               Assign to $G_1$
2. If $y < 2.538$                  Assign to $G_2$

The Bayesian Taxonomic Procedure

1. If $P(G_1/Z_i) \geq 0.500$      Assign to $G_1$
2. If $P(G_1/Z_i) < 0.500$         Assign to $G_2$


When assignments were made in accordance with
these decision rules both models assigned practically
all members of the cross-validation group to the
success group ($G_1$). This result, disappointing though
it may seem, underlines the problem introduced by
lopsided prior probabilities. Consider, for example,
a binomial population with $P_1 = 0.99$ and $P_2 = 0.01$.
Since a strategy which assigns everyone to the first
group will result in a misclassification rate of only
1 percent, any alternative strategy should possess
uncanny accuracy to be justified in terms of
incremental validity.


Strategy 2 assigned individuals in accordance with
a decision rule which assigned higher cost to "false
positives." This was accomplished by applying
formula 9 to determine the critical value for the LDF,
and by setting $P(G_1) = P(G_2) = 0.500$ for the BTP.
The resulting decision rules are listed in Table 3.
When assignments were made in accordance with these
decision rules, the proportions correctly assigned
to the success group by the LDF and the BTP were
0.8933 and 0.9077 respectively. While these proportions
do not differ from one another statistically, they
represent a significant improvement over the a priori
strategy, which produces a success rate of 0.7942.


TABLE 3


DECISION RULE FOR STRATEGY 2

The Linear Discriminant Function

1. If $y \geq 3.992$               Assign to $G_1$
2. If $y < 3.992$                  Assign to $G_2$

The Bayesian Taxonomic Procedure

1. If $P(G_1/Z_i) \geq 0.500$      Assign to $G_1$
2. If $P(G_1/Z_i) < 0.500$         Assign to $G_2$


Significance for Future Research

The present study reveals no significant difference
between the performance of the LDF and the BTP.
This is true regardless of the strategy under which
assignments are made, or the criterion against which
performance is evaluated. Perhaps the nonsignificant
difference is due to the hybrid nature of the
predictors used to develop the models. From the
standpoint of the practitioner, the LDF and the BTP have
their strong and weak points. For the LDF to have
optimal properties, the following conditions must be
met: the predictor measures should be based on random
samples drawn from multivariate normal populations
with homogeneous dispersions, and the relationship
between the predictors and the criterion should
not depart significantly from linearity. These
conditions are rarely met in practice. The BTP involves
the reduction of predictors into a few states,
preferably dichotomies, a procedure which may be
advantageous when the relationship is sharply nonlinear,
but which entails considerable loss of information
content and degrees of freedom. It would therefore seem
intuitively that the effectiveness of the models depends
a great deal upon the quality of data. It would be
insightful to test this hypothesis by developing both
models from one set of data which does not violate these
requirements and another set which does violate these
requirements. The analysis performed on the first
set would show to what extent the loss of information
content and degrees of freedom impairs the performance
of the BTP; and the analysis performed on the
second set would show to what extent failure to meet
the underlying assumptions of LDF impairs the
performance of that model.


Certain features peculiar to the BTP began to show
up during the course of the research. They are
summarized below:

a. Since the Bayesian scheme with p dichotomized
   variables will generate $2^p$ mutually exclusive
   and collectively exhaustive attribute vectors,
   a large volume of data is a sine qua non.

b. It is essential that the number of predictors
   be limited to, say, four or five, for an increase
   in predictors is accompanied by a tremendous
   proliferation of attribute vectors and rapid
   decrease in the number of observations per
   vector. Furthermore, when the predictors are
   increased, the attribute pattern matrix becomes
   quite unwieldy. For example, ten predictors
   generate a 1024 x 10 attribute pattern matrix.
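The growth in attribute vectors can be illustrated with a short enumeration. This sketch uses hypothetical function and variable names. The figure of 690 development cases is derived from the reported sample of 790 less the 100 students set aside for cross-validation.

```python
from itertools import product


def attribute_vectors(p):
    """All 0/1 attribute pattern vectors for p dichotomized predictors."""
    return list(product((0, 1), repeat=p))


matrix = attribute_vectors(10)  # 1024 rows, each a vector of 10 attributes

n_development = 790 - 100       # cases left after the cross-validation draw
for p in (4, 5, 10):
    vectors = 2 ** p
    # Average number of observations available per attribute vector
    print(p, vectors, round(n_development / vectors, 2))
```

With five predictors there are on average about 21 development cases per vector. With ten there is fewer than one, so most conditional probabilities could not be estimated.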
FOOTNOTES

1. The author wishes to acknowledge the help of his
   major advisor, Dr. William Reitz, in carrying out
   this investigation.
2. This measure is the correlation coefficient computed
   with pooled variance and covariance, and the sign
   determined as follows: Let $r_{ij}$ be the correlation
   between population means. If $X_i$ is larger or smaller
   than $X_j$ in both populations, we say that $r_{ij}$ is
   positive. If $X_i$ is larger than $X_j$ in one population
   and the reverse is true in the other population, we
   say that $r_{ij}$ is negative. If $r_{ij}$ and the computed
   correlation coefficient have like signs, $\rho_{ij}$ will
   be positive; if they have unlike signs, $\rho_{ij}$ will
   be negative. See Lubischew (9).
3. In order to develop the attribute vectors, the
   variables used for the Bayesian model were
   dichotomized in a manner that minimized total
   misclassifications (7).
4. This strategy may also be defined as one that
   minimizes total misclassification. See Cooley and
   Lohnes (3:137) and Birnbaum and Maxwell (1:252).
ABSTRACT


The recency of the community college movement has prevented large scale organization and dissemination of
knowledge concerning the processes and steps required to establish a community college. There is, therefore,
a need for means to incorporate contemporary evidence into the curriculum of prospective administrators of
higher education. The purpose of this study was to compare Higher Education Administration students' perceptions
of a logical sequence of events for establishing a community college with a sequence of events for establishing
a community college as determined by practice. The Edwards' matched-pair technique was used to rank the
perceptions of the students. A sequence of twenty-nine events was selected from the literature. Each event was
matched with every other, thus providing 406 decision situations. From these decisions an ordering was obtained.
Using the Spearman rho correlation technique, a coefficient of 0.82 was found between the sequence of events
taken from the literature and rank ordering of these events as determined by the Edwards' matched-pair technique.


HOW TO provide education of a useful kind for
thousands of youngsters from Florida to Washington
and from Arizona to New England is a common question
being asked. The answer to this question being
proposed in many areas is the community college.
For example, a bond issue providing $47,200,000 for
developing community colleges was passed by a 4 to
1 margin by voters of St. Louis, Missouri, and across
the state Kansas City passed a $25 million bond issue
for the same purpose. The state legislature of
Connecticut is backing plans for a statewide network
of community colleges (3).


   In the fall of 1965 alone, eight junior colleges
   opened in Alabama, Arizona, California,
   Connecticut, Florida, Michigan, Minnesota,
   Nebraska, North Carolina, Oregon, Pennsylvania,
   Texas, Virginia, and Washington. Alabama started
   11 and North Carolina 10 at the same time (3:1).


One can hardly deny that the community college
is having a tremendous, if not dramatic, impact on
higher education in the United States. A truly
American innovation, the community college impact is
presenting and shall continue presenting a tremendous
challenge for higher education administrators.
And more particularly this challenge will no doubt
confront those students of higher administration who
will be assuming leadership positions in these new
institutions within the near future.


The recency of the community college movement has
inhibited large scale organization of knowledge
concerning sequences of events for establishing
community colleges. For example, the work of Stivers (5)
and Benson (1) was not available until 1962. There is,
therefore, a need for more knowledge concerning the
application of techniques for systematizing evidence
for use in curricula for training community college
administrators. One way to help meet this need might be
the matched-pair technique developed by Edwards (2) to
create decision situations from evidence by
practitioners in the community college community. There
is some information concerning the establishment of new
community colleges. For example, Benson (1) has
developed a Program Evaluation Review Technique (PERT)
network for such a task and Stivers (5) has recorded a
sequence of events functional for establishing a new
community college. If information of this type could
be conveniently incorporated in training curricula
for prospective higher education administrators, it
should enhance the chances for an orderly process
for community college development.


The purpose of this study was to compare Higher
Education Administration students' perceptions of a
logical sequence of events for establishing a commu-
nity college with a sequence of events for establish-
ing a community college as determined by actual
practice.


METHODOLOGY AND PROCEDURES 
Edwards describes the logic of this study (2). 


Let us suppose that we have ten objects of 
the same size but of differing weights and 
that we wish to arrange the objects from 
the lightest to the heaviest, We could eas- 
ily place each object on a Scale, read the 
Pointer on a dial, and record the measure 
of weight. On the basis of our observations,
the objects
from the lightest to the heaviest. Let's
weighing the ob-
Instead of weighing
We would present them to in-
m to make judgments
about the respective weights of the objects. 


The scale that we use in weighing objects 
we call a physical scale, and the ordering 
of the object in terms of measured weights 
is said to be on a physical continuum. The 
ordering of the objects upon the basis of 


judgments is said to be on a psychological 
continuum (2:19). 


The assumption was made that the sequence of events identified by Stivers (5) defined a physical continuum, and that the sequence of events as judged by the group in this study defined a psychological continuum.


The sequence of events as documented by Stivers 
(5) was arranged according to the event requiring 
the most lead time to the events requiring the least 
lead time. For example, the event requiring the 
most lead time was establishment of the need for a 
new community college. For purposes of measure-
ment this event was assigned number one. All events
were ordered from one through twenty-nine. This
operationally defined the physical continuum.


The ordering of events via the matched-pair technique operationally defined the psychological continuum. Each of the events in perceptual sequence was assigned the same number as those in the physical sequence. However, they were not ordered as such. The ordering of the events for the perceptual sequence was secured by responses from the matched-pair comparison instrument.
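The pairing step can be illustrated with a short sketch. It confirms that matching each of twenty-nine events with every other gives 406 decision situations, and it shows one simple way an ordering might be obtained from the decisions. The scoring rule (count how often each event is judged to occur earlier) is an assumption made for illustration; the source does not describe Edwards' scaling computation, and the function names and demonstration data are hypothetical.

```python
from itertools import combinations
from collections import Counter


def pair_count(n_events):
    """Number of distinct pairs when each event is matched with every other."""
    return n_events * (n_events - 1) // 2


def order_from_choices(n_events, choices):
    """Order events by how often each was judged to come first.

    `choices` holds (event_a, event_b, chosen) tuples pooled over all judges;
    `chosen` is whichever member of the pair was judged to occur earlier.
    This simple tally is an illustrative assumption, not Edwards' procedure.
    """
    wins = Counter({event: 0 for event in range(1, n_events + 1)})
    for event_a, event_b, chosen in choices:
        if chosen not in (event_a, event_b):
            raise ValueError("chosen event must be one of the pair")
        wins[chosen] += 1
    # Most "earlier" judgments first; ties broken by event number.
    return sorted(wins, key=lambda event: (-wins[event], event))


# Hypothetical judge who always picks the lower-numbered event as earlier.
demo_choices = [(a, b, a) for a, b in combinations(range(1, 6), 2)]

print(pair_count(29))                       # 406
print(order_from_choices(5, demo_choices))  # [1, 2, 3, 4, 5]
```

With twenty-eight judges, the decisions of all judges would simply be pooled into one list before the tally.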


An incidental sample of twenty-eight students of Higher Education Administration at the Ohio State University, assumed to be a random sample from a population of individuals who will become administrators of community colleges, were asked to respond to the instrument during the winter quarter of 1966. One hour of time was allotted and all Ss responded to all items. Sample responses were then submitted to the Ohio State University computer center for processing. (A computer program for this research was designed during the summer quarter of 1965 at the Ohio State University.)


PRESENTATION OF DATA 


Table 1 presents the results of this study. It depicts the comparison of the physical continuum of events with the psychological continuum as defined by Ss' perceptions of events that take place in establishing a new community college. Since the level of measurement was ordinal, a Spearman rho correlation test was applied to the two event sequences (4). The correlation between the physical and perceptual sequences was 0.82.
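The reported coefficient can be checked against Table 1 with a short sketch using the untied-ranks formula, rho = 1 - 6 * sum(d^2) / (n(n^2 - 1)). The list below is a reading of the scanned table, not data supplied separately by the authors: each entry is the physical-sequence number of the event the students placed at that position. Two points are assumptions: the fifth perceived event is read as number 13 (the ranking difference of -8 requires it), and the twenty-ninth event, whose name is not legible, is taken to sit in position 29 with a difference of 0.

```python
def spearman_rho(rank_x, rank_y):
    """Spearman rank correlation for two rankings without ties."""
    n = len(rank_x)
    sum_d_squared = sum((x - y) ** 2 for x, y in zip(rank_x, rank_y))
    return 1 - 6 * sum_d_squared / (n * (n ** 2 - 1))


# Row position in Table 1 (1 = earliest perceived event).
perceived_position = list(range(1, 30))

# Physical-sequence number of the event the students placed at each position,
# as read from Table 1 (see the assumptions stated above).
physical_number = [4, 1, 3, 2, 13, 8, 18, 5, 14, 10, 23, 7, 17, 6, 9,
                   19, 11, 16, 15, 24, 12, 21, 26, 27, 20, 22, 25, 28, 29]

rho = spearman_rho(perceived_position, physical_number)
print(round(rho, 2))  # 0.82
```

The squared ranking differences sum to 748, which gives 1 - 4488/24360, or about 0.816, in agreement with the 0.82 reported in the text.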


Using the number of months from the starting date that each event took place along the physical continuum as units of measure, interval measurement can be assumed. On this assumption a Pearson correlation of 0.834 with a standard error of 0.056 was obtained.


Statistical interpretation can be made on the assumption that the group was a random sample; the null hypothesis would state that the two event sequences were uncorrelated. A Student's t calculated from the Pearson correlation coefficient of 0.834 yielded a value of 7.846. For significance at the .01 level, 2.77 was required. Therefore, the two variables were statistically significantly correlated.
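The significance test above can be sketched with the usual conversion of a correlation to Student's t, t = r * sqrt(n - 2) / sqrt(1 - r^2). The sketch assumes n = 29 paired events (27 degrees of freedom), which is consistent with the critical value of 2.77 quoted in the text; the function name is illustrative.

```python
import math


def t_from_r(r, n):
    """Student's t for testing a Pearson r against zero, with n - 2 df."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)


t_value = t_from_r(0.834, 29)  # n = 29 events is an assumption (see above)
print(round(t_value, 2))
```

The result is close to the 7.846 reported; the small difference is presumably due to rounding of r before publication.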


DISCUSSION


Education Administrat 


The purpose of this study was to compare Higher 
logical sequence of 


administrators, 


interesting instructional areas. For example, evidence from the field indicates that "ordering instructional equipment and supplies" is eighteenth in the sequence of events, while the students perceived this as seventh. Field evidence indicates that setting up faculty offices is twenty-third in the sequence, while the students perceived this as eleventh.


The differences discussed above are attributable to the less than perfect correlation of 0.86. Ostensibly, the correlation would lead one to conclude that there is high agreement and, therefore, a great deal of confidence could be placed in the students' perceptions. However, the discrepancies noted above
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TABLE 1 


SEQUENCE OF EVENTS AS THEY OCCURRED IN ESTABLISHING A NEW COMMUNITY COLLEGE AND AS
THEY WERE PERCEIVED AS OCCURRING BY STUDENTS OF HIGHER EDUCATION ADMINISTRATION


Time Before                                                                                              Ranking
Opening Day  Physical Sequence Ranking                    Students' Perceptual Sequence Ranking        Difference
24 months     1. Establish the need                        4. Estimate costs                               -3
22 months     2. Obtain legal authorization                1. Establish the need                            1
20 months     3. Determine financial support               3. Determine financial support                   0
20 months     4. Estimate costs                            2. Obtain legal authorization                    2
18 months     5. Appoint a president                      13. Estimate enrollment and growth               -8
18 months     6. Begin intensive publicity                 8. Determine location                           -2
18 months     7. Engage an architect                      18. Order instructional equipment/supplies      -11
15 months     8. Determine location                        5. Appoint a president                           3
12 months     9. Select key administrators                14. Set up a budget                              -5
12 months    10. Provide secretarial help                 10. Provide secretarial help                      0
12 months    11. Purchase office equipment/supplies       23. Set up faculty offices                      -12
12 months    12. Solicit scholarships and books            7. Engage an architect                           5
9 months     13. Estimate enrollment and growth           17. Select instructional staff                   -4
9 months     14. Set up a budget                           6. Begin intensive publicity                     8
6 months     15. Determine curriculum, issue catalog       9. Select key administrators                     6
6 months     16. Engage remaining administrators          19. Order textbooks/supplies                     -3
6 months     17. Select instructional staff               11. Purchase office equipment/supplies            6
5 months     18. Order instructional equipment/supplies   16. Engage remaining administrators               2
2 months     19. Order textbooks/supplies                 15. Determine curriculum/issue catalog            4
2 months     20. Employ non-instructional help            24. Set up classrooms                            -4
1 month      21. Set up bookstore                         12. Solicit scholarships and books                9
1 month      22. Set up lunchroom                         21. Set up bookstore                              1
1 month      23. Set up faculty offices                   26. Process faculty/student handbooks            -3
½ month      24. Set up classrooms                        27. First faculty meeting                        -3
½ month      25. Arrange for vending machines             20. Employ non-instructional help                 5
½ month      26. Process faculty and student handbooks    22. Set up lunchroom                              4
             27. First faculty meeting                    25. Arrange for vending machines                  2
             28. Registration                             28. Registration                                  0
             29.                                          29.                                               0
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А REINFORCEMENT ANALYSIS ОҒ THREE-MAN 


TEAM PERFORMANCE IN A PSYCHOLOGY COURSE 


JON E. ROECKELEIN¹
Mesa Community College 


ABSTRACT 


Student performance was measured in an introductory psychology course organized into four 3-man "series" and four 3-man "parallel" teams. Six hypotheses were set up to be tested in a situation where the primary dependent variables were responses made on quizzes by students performing alone, and students performing as part of a team. The hypotheses dealing with intra- and inter-team functioning were not confirmed. Hypotheses concerned with overall individual proficiency levels were supported by substantial gains made on post-team versus pre-team tests. It was concluded that reinforcement theory concepts were extrapolated successfully from the experimental laboratory to the college classroom when one considered the specific educational and behavioral objectives which the course of instruction sought to accomplish.


SOME INVESTIGATORS (e.g., 2) study group behavior in the same manner in which individual behavior is usually studied with operant conditioning procedures. The primary unit in these investigations is the behavior of the team rather than the individual and emphasizes the feedback and reinforcement contingencies that are produced as a function of the "group environment." Using this approach, Glaser and Klaus (2) formed "series" and "parallel" teams to study how group feedback, which comprised the reinforcing event, could be contingent upon combined individual performances. Briefly, in a series team, if one member responds incorrectly, no reinforcing feedback is presented to other members even though they may have made correct responses. In a parallel team, on the other hand, a correct response by one or more members can produce a correct team response.


The purpose of the present study was to attempt an extrapolation from studies conducted in the laboratory (e.g., 2) to studies conducted in a formal educational setting by applying the methodology involving 3-man series and 3-man parallel teams to the relatively complex environment of the college classroom. Accordingly, within the general framework of a reinforcement analysis, the following six hypotheses were set up to be tested in a classroom learning situation:


Hypothesis 1. The structure of a series team, and the nature of information processing within a series team, will result in an overall team performance increment; in addition, individual member performance levels will be higher for individuals when they are members of a parallel team.


This characteristic of a series, or non-redundant membership, team may be explained by the reinforcing condition which permits reinforcement only for correct member responses. In this situation, error responses should rapidly extinguish.


Hypothesis 2. The structure of a series team will result in the application of more "social pressure" to certain individual member(s) by other members on the team as compared with the "social pressure" found in a parallel team.


This hypothesis is based on the relative importance which the present method attaches to the consistent high-level performance of each member of a series team (i.e., non-redundant membership) and was tested by comparing the number of individuals absent from class for series versus parallel teams.


Hypothesis 3. The structure of a parallel team, and the nature of information processing within a parallel team, will result in an overall team performance decrement.
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This characteristic of a parallel, or redundant 
membership, team may be explained by the rein- 
forcing condition which permits aperiodic reinforce- 
ment for incorrect member responses. It is assum- 
ed in this situation that error responses will not 
readily extinguish. 


Hypothesis 4. For a parallel team consisting of redundant members who have initially divergent proficiency levels, one very high and the other low, team performance will become primarily a function of the more proficient member and the contributions of the poorer member will become increasingly small.


This hypothesis was tested by analyzing the individual member's performance on sequences of exams or quizzes.


Hypothesis 5. All members of all teams in all sections of the class will perform individually at higher proficiency levels on psychology exams at the end of the course than at the beginning of the course as a result of holding membership in both series and parallel teams.


This hypothesis was tested by comparing intra-member proficiency scores made after the first week of the course with those made after the last week of the course.


Hypothesis 6. All members of all teams in all sections of the class will develop individually more critical ways of solving problems and will develop individually more rigorous study habits and attitudes.


This hypothesis was tested by comparing intra-member scores on two separate forms of the Watson-Glaser Critical Thinking Appraisal (CTA) and the Brown-Holtzman Survey of Study Habits and Attitudes (SSHA).
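The two reinforcement contingencies on which the hypotheses rest, as described in the introduction, can be summarized in a minimal sketch: a series team response counts as correct only if every member is correct, while a parallel team response counts as correct if one or more members are correct. The function names are illustrative, and the sketch ignores the leader's role in deciding the parallel team's final answer.

```python
def series_team_correct(member_correct):
    """Series (non-redundant) team: one member's error blocks reinforcement."""
    return all(member_correct)


def parallel_team_correct(member_correct):
    """Parallel (redundant) team: one correct member is enough."""
    return any(member_correct)


# Hypothetical quiz item on which the third member of a 3-man team errs.
responses = [True, True, False]
print(series_team_correct(responses))    # False
print(parallel_team_correct(responses))  # True
```

Under the series rule an individual error is never reinforced, whereas under the parallel rule it sometimes is, which is the contrast Hypotheses 1 and 3 draw on.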


METHOD 
Subjects and Procedure²


Twenty-four students in an introductory psychology course (summer session) were divided into 3-man teams comprising two sections of teams: four series and four parallel teams. Teams and sections were formed 1 week after the beginning of the 5-week course, which met for 1 hour, 45 minutes each day for 5 days a week. The first week was devoted to testing individual students to obtain individual proficiency scores, which were used when forming groups to insure initial homogeneity across teams and sections. During the first week, students were informed that the two primary objectives of the course were: (1) to provide the student with selected facts, principles, and concepts of general psychology; and (2) to increase the students' proficiency in taking multiple-choice and fill-in type examinations. Two forms each of the CTA and SSHA were administered to the students, one during the first week and one during the last week of the course. The same final exam, covering the entire course work in psychology, was also given during the first and last weeks. The regular testing materials consisted of multiple-choice and fill-in type questions drawn from the textbook material assigned. All questions were chosen from an exam pool used in other introductory psychology courses at that school. Each quiz used in the course consisted of ten questions. Response bias across quizzes was carefully controlled; that is, all quizzes had an overall equivalent level of difficulty based upon previous usage of questions. During the course, no formal lectures were scheduled by the instructor; however, a table labeled "Lecture Table" was available to all teams throughout the course and a team could meet with the instructor for 5-6 minutes at a time for short lectures on any material which was unclear to the team concerning a particular assignment.


The overall design of the present study may be classified as "quasi-experimental, time-series" (1). The intention was to provide situations where periodic measurement instruments (quizzes) could be administered to the group and certain experimental changes (series and parallel team conditions) could be applied at various places throughout the experimental space (college course in introductory psychology).


RESULTS AND DISCUSSION 


Table 1 contains the results of three analyses involving the total number of error responses produced by series and parallel teams (all students served under both series and parallel team conditions for an equivalent amount of time). Two of the analyses were made by splitting each type of team, series or parallel, in half and comparing the halves (Series A, Series B, or Parallel A, Parallel B) through the application of statistical tests. Since there was an odd number of days in the total number of days used, it was decided arbitrarily


Sis; thus, the middle day 
“buffer” day which conve: 


were used, 
evaluati in- 
dividual кы cam participation upon i? 

While all of the split- | 
comparisons indicate 


teams in s rsus parallel team ne 
Teams in Section I performed first under the series team condition and then under the parallel team condition; teams in Section II performed under the parallel team condition first and the series team condition second. Thus, a potential time-position effect may have been operative in which the condition which occurred first, whether series or parallel type, was the optimum one.




TABLE 1 


ERROR RESPONSES OF SPLIT-HALF SERIES AND
PARALLEL TEAMS AND SERIES VERSUS
PARALLEL TEAMS


Teams               Section I*           Section II

Series A, B         204, 157             227, 155
                    p (ns)               p (ns)

Parallel A, B       297, 176             154, 143
                    p < .05              p (ns)

Series, Parallel    449, 534             437, 382
                    p (ns)               p (ns)


* Section I contained four teams, Section II contained three teams.
a Determined by t-tests.
b Determined by sign tests for matched pairs (3).


The record of daily absences from class during 
all team conditions revealed that Section I series 
teams had no absences, Section I parallel teams had 
35 percent, Section II series teams had 15 percent, 
and Section II parallel teams had 50 percent of the 
total number of absences. In summary, across sec-
tions, the series teams were responsible for 15 per-
cent of the total number of absences, while the par-
allel teams were responsible for 85 percent.


A further analysis of the split-half parallel teams in relation to individual member's proficiency levels showed that in both sections a majority of the originally low proficiency members improved their performances over time (28% showed decrements); that is, the majority of low proficiency members showed regular increments and made greater contributions to team output as time progressed. Of the initially high proficiency members, 43 percent gave poorer performances (43% produced increments and 14% showed no change) over time. Thus, there appears to have been some sort of "leveling" factor present wherein the initially high proficiency member and the initially low proficiency member approach a mutual or common proficiency level.


The foregoing result led us to construct a prediction table concerning series and parallel teams. According to an earlier analysis (2), with series teams the performance of team members determined team performance, and was a multiplicative function of the probability that each team member would respond correctly on any one question. Moreover, in a parallel team situation, it was predicted that when a leader originally performs at less than high proficiency levels, the addition of redundant high proficiency members would add substantially to the team output. On the other hand, when a parallel team leader originally performed at high proficiency levels, it was predicted that the addition of two redundant low proficiency members would subtract substantially from the team output. Proficiency values determined by the actual classroom performance levels of individuals and teams are also shown in Table 2. For the parallel teams, 86 percent of the team leaders led their teams to actual proficiency levels above the predicted range, while only 14 percent of the team leaders placed their teams at actual proficiency levels within, or below, the predicted range. In the case of series teams and the predicted versus actual proficiency levels, without exception the actual levels were above the predicted levels of proficiency for all teams. Thus, there were substantial team gains under both series and parallel team conditions and neither the multiplicative law for series teams, nor the formula of decrements, $M_1((L + M_2(1.00 - L)))$, for parallel teams seemed to hold completely in the present study which involved relatively complex learning tasks.
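The two prediction rules named above can be written out as a brief sketch. The series prediction is the product of the members' individual proficiencies (the multiplicative law). The parallel prediction follows the formula as it can be read from the scanned text and the Table 2 footnote, M1((L + M2(1.00 - L))), where L is the leader and M1 and M2 are the redundant members; that reading, the function names, and the example values are assumptions.

```python
import math


def series_predicted(member_levels):
    """Multiplicative law: every member must respond correctly."""
    return math.prod(member_levels)


def parallel_predicted(leader, m1, m2):
    """Parallel-team formula as read from the text: M1 * (L + M2 * (1.00 - L))."""
    return m1 * (leader + m2 * (1.00 - leader))


# Illustrative proficiencies (proportion of correct responses).
print(round(series_predicted([0.86, 0.95, 0.70]), 2))  # 0.57
print(round(parallel_predicted(0.90, 0.80, 0.50), 2))  # 0.76
```

Because the series prediction is a product of proportions, it always falls at or below the weakest member's level, which is why actual series team levels above the prediction count as a team gain.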


Table 3 shows individual student proficiency levels before and after holding membership on teams under both series and parallel team conditions. The results in Table 3 are reported in terms of gains of the proportion of correct responses for both the quizzes and the final exams which were given after team formation and before team participation. Only one student failed to show a gain on the quiz material from the pre-team to post-team testing situations; on the same final exam, however, the student did show a gain in proficiency. A calculation of the average gain from pre-team to post-team conditions for all students in the class who served on teams showed a proportion gain of .20 for the quizzes and .47 for the final exam. The scores from the two
students whose team in the second 
week showed averag 


the quizzes and .3 
its small size, this 2-student control group data was 


considered to b further statistical eval- 

uations were made using itas non- treatmentcriteria. 
All the students iginally part of a larger рор” 
ulation which ге ; students were 
randomly assigned to the treatment group by taking 

the names on the 1 
sheet which was made during the 
tration period. Both groups of S 


time е: 
course concurrently at the same 
under the traditional lecture method and the 


a - method. The final ex- 
other under the 3 Қане 


iven on both exams. 
tions (out of à total 
on each exam this procedure, and 
basisfor comparison оаа group 
i , A t-test o е егепсе 

with ал rr ey ses of the two groups show- 
he untreated group was far inferior to the 


ed that t! 5 bi 
о 5 
tero) and again indicated the effective- 


z procedure used with the 3- 


There were 


the multiple qui 


The results from the pre-team and post-team 
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TABLE 2 


PREDICTED AND ACTUAL INDIVIDUAL AND TEAM PROFICIENCY LEVELS*

2 rallel 
Predicted Actual Predicted Parallel Actual Paralle 


^21 Meet 
Series Series Team Level Under T o кре er 
a % Leadership of Leaders 
Team Member A Member B Member C iram bei а эши | 
i A B [e A n i 
S .86 .95 ‚10 .57 ‚84 
i -68 .66- .66- 71- 
е ќа ps gO „11 . 80 .98 91 .79 зе, 
. 89 «14 Ji 280 
j E ."4- .73- 14- к өз 
E m Ба ii .82 .74  .82 .94  .87 . 
3 .83 .82 56 оз 
| 7 .15- ,75- 19- 82 
E 5 Ө d .94  .84 .82 
.82 .79 82 
S 287 264 288 .48 19 . 
.66- .69-  .66- 
"s 275 210 M P o2 80 
Zu 71 69 
5 ‚14 .62 269 32 .69 
60- .63- 60- 
р 218 ‚64 .68 FT E. .85 
-63 .69 6 
5 286 .90 287 267 .88 : 9 
қ 91- .90- E 
P .91 .94 .92 90. < 4 55 | 
93, 
5 .19 .6 .69 31 ED 91 .93 
: .62- 69- B 
” 7 à id | p» .95 .88 -85 
.69 ,79 279 
* Table entries indicate proportion of correct responses.
a For Series (S) teams, predicted overall proficiency is computed using the multiplicative law of probability.
b For Parallel (P) teams, predicted overall proficiency is computed using the formula: $M_1((L + M_2(1.00 - L)))$, where L denotes leader, and $M_1$ and $M_2$ denote redundant members.


administration of the CTA and SSHA showed that on the second testing of the CTA, 70 percent of the students scored higher than in the first testing session (13% scored lower, and 17% showed no change); on the second testing of the SSHA, 44 percent of the students scored higher than in the first testing period (47% scored lower, and 9% showed no change in scores). See Table 3.


SUMMARY AND CONCLUSIONS


The model which was adopted (2) for describing information processing activities within 3-man series and parallel teams received only limited confirmation. According to the model, in a series team all members had to perform correctly in order to have a high score on a quiz and all individual performances were "added" (i.e., the team contained non-redundant membership roles); in a parallel team only the leader had the final word or action as to what the team product on the quiz would be and some individual performances were "subtracted" (i.e., the team contained redundant membership roles). However, it was determined that predictions based upon the model were supported fully only in certain circumstances involving increments in individual member proficiency, less completely in circumstances involving proficiency increments and decrements in split-half series and parallel teams (see Table 1), and not at all in circumstances involving the role of leader in team proficiency (see Table 2). Specifically, the hypotheses which were set up to be tested in the classroom situation had the following degrees of confirmation. The notion in Hypothesis 1 concerning series team increments over time was supported somewhat, but the notion of series team structure superiority over the parallel team structure in proficiency levels was not supported. Hypotheses 2, 3, and 4 were not supported. Hypothesis 5 concerning increments in
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TABLE 3 


INDIVIDUAL PROFICIENCY LEVELS BEFORE AND AFTER MEMBERSHIP ON SERIES AND PARALLEL TEAMS*


           Performance on   Performance on            Performance on Final Exam
           Quizzes Before   Quizzes After             First      Second
Student a  Membership       Membership       Gain     Testing    Testing    Gain
 1         .43              .88              .46      .33        .85        .52
 2         .85              .95              .10      .27        .88        .61
 3         .68              .76              .08      .27        .72        .45
 4         .56              .89              .33      .37        .86        .49
 5         .80              .98              .18      .35        .88        .53
 6         .52              .82              .30      .37        .69        .32
 7         .72              .83              .11      .30        .68        .38
 8         .75              .90              .15      .43        .89        .46
 9         .55              .79              .24      .35        .74        .39
10         .73              .90              .17      .45        .86        .41
11         .53              .70              .17      .34        .62        .28
12         .73              .89              .16      .17        .89        .72
13         .68              .86              .18      .35        .70        .35
14         .49              .60              .11      .40        .60        .20
15         .86              .95              .09      .30        .83        .53
16         .55              .94              .39      .24        .96        .72
17         .65              .98              .33      .38        .99        .61
18         .80              .95              .15      .26        .96        .70
19         .52              .83              .31      .17        .75        .58
20         .72              .93              .21      .36        .70        .34
21         .75              .70              None     .38        .65        .27


* Table entries indicate proportion of correct responses.
a The data of two students from the disbanded team are not included.


individual proficiency from the beginning of the course to the end of the course was supported by the data from the multiple quizzes and exams. Hypothesis 6 was partially supported when a majority of students showed an increase in their scores on a critical thinking test but not on a study-habits survey.


In conclusion, the partial failure of the present study to verify hypotheses which were derived from earlier experimental studies on team performance should not detract from the fact that an extrapolation of reinforcement theory from the laboratory to the classroom situation was made successfully and the method used to test the hypotheses was effective in raising individual students' proficiency levels. For instance, while our study did not follow an earlier analysis (2) of the operator or leadership position in series versus parallel teams, members of both types of teams in our course made substantial gains on the assigned material. Thus, the primary educational and behavioral objectives that were set up were achieved in spite of the incomplete success concerning the secondary objective of testing hypotheses. We are currently employing the team-testing technique as part of our regular semester psychology course which meets for three sessions a week; the first session is the instructor's lecture; the second is student recitation and discussion; and the third is an examination period which employs the series and parallel team structures.
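The average gains cited for Table 3 (.20 for the quizzes and .47 for the final exam) can be checked with a short sketch. The proportions below are a reading of the scanned Table 3 for the 21 students listed; several digits were illegible in the scan and have been inferred from the printed gain columns, so the lists should be treated as an assumption rather than as the authors' raw data.

```python
quiz_before = [.43, .85, .68, .56, .80, .52, .72, .75, .55, .73, .53,
               .73, .68, .49, .86, .55, .65, .80, .52, .72, .75]
quiz_after = [.88, .95, .76, .89, .98, .82, .83, .90, .79, .90, .70,
              .89, .86, .60, .95, .94, .98, .95, .83, .93, .70]
final_first = [.33, .27, .27, .37, .35, .37, .30, .43, .35, .45, .34,
               .17, .35, .40, .30, .24, .38, .26, .17, .36, .38]
final_second = [.85, .88, .72, .86, .88, .69, .68, .89, .74, .86, .62,
                .89, .70, .60, .83, .96, .99, .96, .75, .70, .65]


def mean_gain(before, after):
    """Average post-minus-pre gain in proportion of correct responses."""
    gains = [post - pre for pre, post in zip(before, after)]
    return sum(gains) / len(gains)


print(round(mean_gain(quiz_before, quiz_after), 2))     # 0.2
print(round(mean_gain(final_first, final_second), 2))   # 0.47
```

Both averages agree with the values reported in the text, which lends some support to the digit repairs made in reading the table.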


FOOTNOTES 
1. The present study was made possible through the assistance and cooperation of Dr. John Riggs, Dr. Ray Cattani, Dr. James Scoresby, and Mr. Jerrell Ferguson.
2. Instructions which were given to the students concerning series and parallel team functioning and the "rules" for processing the quizzes in the teams tended to be lengthy and are omitted. You can receive copies of these instructions by writing to Dr. Jon E. Roeckelein, Department of Psychology, Arizona State University, Tempe, Arizona 85281.
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TEST STATISTICS AS A FUNCTION 


OF ITEM ARRANGEMENT 


DAVID M. SHOEMAKER’ 
Oklahoma State University 


ABSTRACT 


Using a technique that controlled exposure of items, the investigator examined the effect on mean test score, item difficulty index, and reliability and validity coefficients of the reordering of items within a power test containing ten letter-series-completion items. The results suggest that effects on test statistics from item rearrangement are, generally, minimal. The implication of these findings for test designs involving an item sampling procedure is that performance on an item is minimally influenced by the context in which it occurs.


ITEM SAMPLING, a procedure involving the 
planned confounding of subjects and test items, has 
been demonstrated empirically (4,5, 8) and algebra- 
ically (7) to be a valuable experimental design in 
test research. However, one critical assumption in 
item sampling is that performance on an item is min-
imally influenced by the context in which it occurs.
This assumption has ostensibly been evaluated in 
several investigations (e.g., 1,6,9) which have di- 
rectly or indirectly examined the effect on test sta- 
tistics of rearranging items within a test. The stan- 
dard procedure for manipulating item sequence has 
been that of reordering items on the printed page. 
The assumptions are (a) an examinee responds to 
items in the order in which they appear on the print- 
ed page, and (b) after responding to an item, an ex-
aminee does not give it additional consideration. For
power tests, both assumptions are questionable and 
suggest that the traditional procedure should be mod- 
ified to eliminate uncontrolled item review on the 
part of the examinee. The point to be made is that 
any investigation attempting manipulation of test item 
arrangement must incorporate a procedure which 
strictly controls the exposure of test items to exam- 
inees. 

An additional consideration in examining test sta- 
tistics as a function of item arrangement is the con- 
tent of the items. It is hypothesized that item rear- 
rangement will significantly influence test statistics 
for those content areas where an item solution is con-
tingent upon the generation of concepts or algorithms.
Contrast, for example, a set of vocabulary items with 


a set of letter-series-completion items (sample item: 


DABEDCFEFG ?) of the type found in the Thur- 
stone Letter Series Completion Test. Determining 
the appropriate solutions for vocabulary items does 
not appear, at least intuitively, to depend upon the 
generation of algorithms; however, results obtained 
by Simon and Kotovsky (10) on the acquisition of concepts
for sequential patterns strongly suggest that
the opposite is true with letter-series-completion 
items. As algorithms vary in complexity, it may be 
argued that experience with less complex algorithms
will facilitate the generation of more complex algo- 
rithms. 


Using a technique which controlled item exposure 
to examinees, the present investigator examined the
effect on mean test score, item difficulty index, re- 
liability, and validity coefficients of the reordering 
of items within a test. The items selected for con- 
sideration were of the letter-series-completion type. 


METHOD 


From results obtained in a series of item analy- 
ses of letter-series-completion items, a set of ten 
items (referred to hereafter as the Letter Series 
Test) was selected. Difficulty indices for the items 
were rectangularly distributed with a range of .167
to .917. Four experimental forms of the Letter Series
Test were constructed. In Forms 1, 2, and 3
the items were arranged within the test booklet with
one item per page; in Form 4, all items were on
one page. In Form 1, the items were sequenced in
order of ascending difficulty; in Form 2, in descending
difficulty; in Form 3, randomly sequenced; and in Form
4, in the same sequence as Form 3. After a brief introduction
involving five practice items, the instructions
were as follows:


There are ten items in the test-one item on 
each page. Work the items in the order in 
which they appear in the test. After working 
an item, fold that page behind the test and 
proceed to the next item. Do not return to 
any item after you have once worked it or 
have tried to work it. If you are unable to 
work a particular item, fold that page be- 
hind the test and proceed to the next item. 
If any answer to an item would just be a
"wild guess" on your part, skip that item
and go on to the next one. There is no time
limit on the test.


Identical sample items were used in the four forms
of the Letter Series Test. For Form 4, however, the
instructions were modified to exclude reference to
items being printed on separate pages.


Four classes of college students from introductory
courses in psychology, sociology, and educational
psychology were selected as Ss. In three of the
four classes Forms 1, 2, and 3 were distributed among
students in an alternating fashion; in the fourth class
only Form 4 was administered. (The undesirable
confounding of Form 4 with a specific group of examinees
was necessitated by the nature of the instructions
on the test booklet.) A 20-item Number Series
Test (sample item: 2 4 6 8 11 13 15 ?) having a 6-minute
time limit was also administered to each examinee.
Administering the Number Series Test to
all examinees permitted an additional technique of
data analysis, namely, analysis of Letter Series Test
results by subgroups homogeneous (above median,
below median) on the Number Series Test. While both
tests were administered to all examinees, the order
of test administration was counterbalanced across
classes.


No unusual circumstances arose during the administration
of any test. Examinees appeared to
follow the instructions as outlined in the test booklet.


RESULTS


Several analyses of variance (ANOVA) were performed
on the Number Series and Letter Series test
scores and, as the significance levels of the computed
F statistics were generally greater than 15 percent,
the ANOVA results can be briefly summarized
as follows: (a) For the Number Series Test scores
no significant differences were observed between
classes, between positions of administration in the
test battery, or between sexes. Identical analyses
involving Letter Series Test scores produced similar
results. (b) Using proportional cell frequencies,
a 4x2 analysis of variance of Letter Series Test scores
was performed involving experimental test forms
and groups (above median, below median on Number
Series Test). Both the forms (F(3, 124) = 1.684, .25 >
p > .15) and the forms X group interaction (F(3, 124) =
.953) effects were judged to be nonsignificant. The
group effect (F(1, 124) = 9.523, .005 > p > .001) was
judged to be significant and considered a confirmation
of the method of grouping examinees into above-median
and below-median groups on the basis of the
Number Series Test score.
An item difficulty index (proportion answering the
item correctly) was computed for each item in each
experimental test form for (a) the total sample, (b)
each sex, and (c) above-median examinees and below-median
examinees on the Number Series Test.
Mean test score, standard deviation of test scores,
and Kuder-Richardson Formula 20 reliability coefficients
were computed for each experimental test form
in each analysis. As similar results were obtained
in all analyses, only the results for the total sample
are given in Table 1. The inter-form intercorrelations
among item difficulty indices for each experimental
test form for the total sample are given in
Table 2. The validity coefficients for each form of
the Letter Series Test are given in Table 3. In addition,
the regression equations for each experimental
test form on Number Series Test scores were computed.
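The two item-level statistics named above can be illustrated with a short sketch. The response matrix is hypothetical (it is not data from the study), and the use of the population variance of total scores in the KR-20 denominator is an assumption, since the source does not say which variance convention was applied.

```python
from statistics import pvariance


def item_difficulty(responses):
    """Proportion of examinees answering each item correctly.

    `responses` holds one row per examinee of 0/1 item scores.
    """
    n_examinees = len(responses)
    n_items = len(responses[0])
    return [sum(row[j] for row in responses) / n_examinees for j in range(n_items)]


def kr20(responses):
    """Kuder-Richardson Formula 20 for dichotomously scored items."""
    k = len(responses[0])
    p = item_difficulty(responses)
    totals = [sum(row) for row in responses]
    var_total = pvariance(totals)  # assumption: population variance of total scores
    sum_pq = sum(pj * (1 - pj) for pj in p)
    return (k / (k - 1)) * (1 - sum_pq / var_total)


# Hypothetical 0/1 scores: five examinees, four items.
scores = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]
difficulty = item_difficulty(scores)
reliability = kr20(scores)
mean_score = sum(sum(row) for row in scores) / len(scores)
```

Because each difficulty index is a proportion correct, the indices for a form sum to that form's mean test score, which gives a quick consistency check on a table of item difficulties.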


TABLE 1


Item*         Form 1       Form 2       Form 3       Form 4

 1 (.917)     [illegible]  [illegible]  [illegible]  .913
 2 (.792)     .875         .950         .972         .913
 3 (.708)     .815         .625         .667         .826
 4 (.583)     .650         .600         .500         .522
 5 (.500)     .675         .700         .694         .696
 6 (.417)     .650         .450         .500         .609
 7 (.292)     .800         .675         .750         .957
 8 (.250)     .350         .350         .417         .609
 9 (.208)     .500         .475         .411         .652
10 (.167)     .650         [illegible]  .500         .565

N             40           40           36           23
Mean          6.98         6.43         6.36         7.30
SD            2.13         1.99         1.90         1.88
KR20          .668         .573         .509         .557

* Values in parentheses: difficulty index of item from the item analyses
used in construction of the 10-item experimental test.

The individual regression equations are as follows:


N = .417 L_2 + 8.871
N = .264 L_3 + 9.904
N = .735 L_4 + 6.284

Since each examinee has taken both Number and Let- 
ter Series tests, scores on the experimental test 
forms could be compared by means of the analysis 
of covariance procedure described by Gulliksen and 
Wilks (3). 


TABLE 2


INTER-FORM PRODUCT-MOMENT CORRELATION
COEFFICIENTS FOR ITEM DIFFICULTY INDICES
PER EXPERIMENTAL TEST FORM FOR TOTAL
SUBJECT SAMPLE


         Experimental Test Form

Form     1       2       3       4
 1               .857    .862    [illegible]
 2                       .936    .123
 3                               .889
 4

TABLE 3


VALIDITY COEFFICIENTS: PEARSON PRODUCT-
MOMENT CORRELATION COEFFICIENTS FOR
EXPERIMENTAL FORMS OF LETTER SERIES
TEST (L) WITH NUMBER SERIES TEST (N)


Test      n     r_NL    s_N^2    s_L^2

Form 1    40    .534    8.724    4.537
Form 2    40    .405    4.248    3.960
Form 3    36    .213    5.576    3.610
Form 4    23    .527    6.836    3.534
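As a rough consistency check, the slope of each printed regression equation can be recomputed from Table 3. The sketch below assumes that the last two columns of Table 3 are the variances of the Number Series (N) and Letter Series (L) scores. That reading is an inference: the column headings are not legible, but the final column equals the squared standard deviations in Table 1. The function name is illustrative.

```python
from math import sqrt


def slope_of_n_on_l(r, var_n, var_l):
    """Least-squares slope for predicting N from L: b = r * s_N / s_L."""
    return r * sqrt(var_n / var_l)


# (r, assumed variance of N, assumed variance of L) as read from Table 3.
table3 = {
    "Form 2": (0.405, 4.248, 3.960),
    "Form 3": (0.213, 5.576, 3.610),
    "Form 4": (0.527, 6.836, 3.534),
}

# Slopes as printed in the three legible regression equations.
printed_slopes = {"Form 2": 0.417, "Form 3": 0.264, "Form 4": 0.735}

recomputed = {form: slope_of_n_on_l(*row) for form, row in table3.items()}
differences = {form: abs(recomputed[form] - printed_slopes[form]) for form in table3}
```

Under that assumption the recomputed slopes agree with the printed ones to within a few thousandths, which also suggests that the three legible equations belong to Forms 2, 3, and 4.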


The hypotheses of equal errors of estimate, equal
line slopes, and equal intercepts were
tested; the significance levels of the obtained differences
were, in each case, greater than .30. The
item difficulty indices per item position across experimental
test form are given in Table 4.


DISCUSSION


The fact that no significant differences in Number
Series and Letter Series test scores were observed between classes
suggested that treatment effects were not being confounded
with classes. Data were then pooled across
classes to increase the sample size per experimental
test form and hence the stability of the computed


TABLE 4


ITEM DIFFICULTY INDICES PER ITEM POSITION
IN LETTER SERIES TEST


Item        Total Sample    Males       Females
Position    (N = 139)       (N = 62)    (N = 77)

 1          .748            .790        .714
 2          .604            .613        .597
 3          .568            .613        .532
 4          .784            .774        .792
 5          .532            .565        .506
 6          .698            .694        .701
 7          .633            .645        .623
 8          .669            .694        .649
 9          .770            .790        .753
10          .691            .677        .701


test statistics. Another possible confounding was
position of experimental test in the test battery. However,
for the Letter Series Test the mean test score
for those examinees receiving the test first in the 
battery was not significantly different from the mean 
test score of those receiving the test in second posi- 
tion. A similar result was obtained for the Number 
Series Test. The possibility, then, that test perfor- 
mance on the second test in the battery was facilitat- 
ed by taking the first test was not supported by the 
data. Furthermore, when item difficulty indices were
computed for each item position across experimental 
test forms, items occurring toward the end of a test 
were not answered correctly more often than items 
appearing at the beginning of a test. The effect on 
mean test score of item rearrangement was found to 
be minimal; however, the trend consistently found in 
all analyses was that of Form 4 having the largest 
mean test score followed consecutively by Forms 1, 
2, and 3. Form 4, it will be recalled, was the experimental
test form with all items printed on one
page; thus, it was possible for an examinee to modi- 
fy a previous answer based on any insights acquired 
in taking the test. 


One noticeable effect regarding items was the change 
in item difficulty indices between the pilot study and 
the principal investigation. Items found to be most 
difficult in the pilot study were generally not as diffi- 
cult in the main investigation. It should be noted that 
this finding, while interesting and undoubtedly a re- 
flection of sampling bias, does not in any way qualify 
the results obtained in this study. The item difficul- 
ty indices in Table 1, specifically those computed for 
items 3, 6, 7, 8, and 9, suggest that, when item exposure
is strictly controlled, the item difficulty index
for certain items may be a function of the position
of that item within the test.


THE COHORT-SURVIVAL RATIO METHOD
IN THE PROJECTION OF SCHOOL ATTENDANCE


WILLIAM J. WEBSTER 
American Institutes for Research 


ABSTRACT 


The purpose of this study was to test the hypothesis that educational attendance projection problems could 
be formulated in such a manner that the regression model could be used in the analysis thereof. Two projection 
approaches, the Cohort-survival ratio approach and a comparable regression approach, were compared using a 
sample of twenty-five Michigan school districts. The regression approach provided significantly better estimates
than did the Cohort-survival ratio approach.


THE ACUTE need for educational planning has 
long been realized. One essential facet of this plan- 
ning involves the estimation of numbers of pupils to 
be housed at relevant future dates. Accurate enroll- 
ment projections would greatly assist educators in 
making important decisions affecting future educa- 
tional and fiscal allocations. However, despite the 
numerous and important uses of accurate enrollment
projections, the formulae currently most used for
projection purposes display a notable lack of desired
precision (4,5).


The purpose of the present study was to test the
hypothesis that school attendance projection problems
can be formulated in such a manner that the regression
model can be used to obtain more accurate estimates
of future public school enrollment than can be
obtained from the most popular of the ratio projection
methods currently used by educators. The distinction
between ratio and regression analysis methods
is made on the basis of the measure of relationship
used in the computation of projected enrollments.
Ratio methods use the ratio or proportion of a given
predictor to the past, while regression methods depend
upon coefficients of correlation and multiple correlation.
Such [...] and provide projection [...]
that has been established in the past. It is believed
that this procedure represents a new [...]
method of estimating future public school educational
attendance patterns, although similar procedures
have occasionally been experimented with in higher
education (1).


METHODOLOGY 


The most popular as well as the most accurate of 
the enrollment projection techniques currently used 
by educators for projecting future public school en- 
rollment is the Cohort-survival ratio method. This 
method depends upon the relationship of birth-to- 
grade and grade-to-grade statistics for a number of 
years. Predictor variables are births and past ten- 
dencies of school children to advance from one grade 
to the next. The Cohort-survival ratio method was 
one of the projection methods examined in this study.


A regression analysis, utilizing the same predic- 
tor variables as the Cohort-survival ratio method, 
was also employed. 
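A minimal sketch of the regression alternative follows, assuming the grade-level form E_n = a + b (E_{n-1}) that appears in the summary later in this methodology. The pairing of observations (each historical grade n enrollment paired with the enrollment one grade lower) and all figures are hypothetical; the source's exact fitting procedure is not fully legible. The names `fit_line` and `project_grade` are illustrative.

```python
def fit_line(x, y):
    """Ordinary least-squares intercept a and slope b for y = a + b * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


def project_grade(a, b, lower_grade_enrollment):
    """Projected grade n enrollment from the grade n-1 figure."""
    return a + b * lower_grade_enrollment


# Hypothetical history: grade n-1 enrollment (x) and grade n enrollment (y).
lower_grade = [100, 110, 120, 130, 140]
upper_grade = [97, 106, 115, 124, 133]

a, b = fit_line(lower_grade, upper_grade)
projection = project_grade(a, b, 150)
```

Unlike a pure ratio, the fitted line carries an intercept, so the projected grade n enrollment need not be strictly proportional to the grade n-1 figure.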


Cohort-survival ratio analysis and regression 
analysis were used to develop projections of elemen- 
tary (K-8) and secondary (9-12) enrollment for a 
sample of twenty-five school districts for the years 
1965 and 1968. Projections were based on data col- 
lected from the years 1955-1960. Actual enrollment 
during the years 1965 and 1968 was used as the cri- 
terion against which to judge the effectiveness of each 
of the two projection methodologies. The procedures
used in obtaining projections from the two models
are outlined below.


1. The actual number of births by place of residence
by year for the period 1950-1960 was obtained
for each district in the sample.


2. The actual enrollment figures by grade for 
kindergarten through twelfth grade for the years 1955- 
1960 were obtained for each district in the sample. 


3. The mean number of births by place of resi- 
dence for the years 1952-1960, 1948-1951, 1955- 
1963, and 1951-1954, was computed for each district 
in the sample. These dates correspond to the birth
dates of children in elementary school in 1965, in 
secondary school in 1965, in elementary school in 
1968, and in secondary school in 1968, respectively. 


4. For the Cohort-survival ratio method, the average
ratio of survivors from birth, 5 years earlier,
to kindergarten, and from grade to successive grade,
was computed for each district in the sample. Ratios
were based on data corresponding to the period 1955-
1960. This procedure resulted in thirteen separate
retention ratios, one corresponding to the average
percentage of students "surviving" the transition into
each successive grade, kindergarten through twelve.
According to the nature of desired projections (elementary
or secondary enrollment) and the year to
which they were to be made, one of the means referred
to in step three was chosen as a starting point.
(For example, the mean number of births by place
of residence for the years 1952-1960 was used as a
starting point in projecting elementary enrollment to
1965.) The average retention ratio of kindergarten
enrollment to births 5 years earlier was multiplied
by the relevant mean number of births to obtain a
projection of kindergarten enrollment in the desired
year. This estimate was then multiplied by the average
retention ratio of kindergarten to first grade
to obtain a projection of first grade enrollment in the
desired year. This procedure was repeated seven
additional times: 1 to 2, 2 to 3, 3 to 4, 4 to 5, 5 to
6, 6 to 7, and 7 to 8. Each time different input variables
were used to obtain projections of enrollment
in each of the nine elementary grades (K-8).


If secondary enrollment was being projected, the 
same procedure was used to obtain an estimate of 
eighth grade enrollment. Starting with the appropri- 
ate mean number of births by place of residence, nine 
retention ratios, one corresponding to transition into 
each of the grades, K-8, were successively progress- 
ed through to obtain the required eighth grade pro- 
jection. Using that projection as a base, enrollment 
was projected to ninth grade. This procedure was 
repeated three additional times; 9 to 10, 10 to 11, 
and 11 to 12. Each time different input variables 
were used to obtain projections of enrollment in each 
of the four secondary grades (9-12). 


The separate projections were finally summed 
over grades to obtain the relevant projections of elementary
(K-8) and secondary (9-12) enrollment.


In summary, for each grade:


E_n = E_{n-1} (R_{n-1 to n})


Where


E_n = Grade n enrollment in year x,


E_{n-1} = Grade n-1 enrollment in year x,


R_{n-1 to n} = The average retention ratio denoting
the average survival rate of
students from grade n-1 to grade
n for the period 1955-1960.
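The chained multiplication described in step 4 can be sketched as follows. The data are hypothetical and cover only three grade levels for brevity. The exact pairing used to form each historical ratio (grade g in one year over grade g-1 in the preceding year) is the conventional cohort-survival reading and is assumed here rather than stated in the source.

```python
def mean(values):
    values = list(values)
    return sum(values) / len(values)


def average_ratios(births, enrollment):
    """Average survival ratios from historical data.

    births: {year: number of births}
    enrollment: {year: [K, grade 1, grade 2, ...]}
    Returns [birth-to-K, K-to-1, 1-to-2, ...].
    """
    years = sorted(enrollment)
    n_levels = len(enrollment[years[0]])
    # Kindergarten enrollment over births five years earlier.
    ratios = [mean(enrollment[y][0] / births[y - 5] for y in years if y - 5 in births)]
    # Grade g in year y+1 over grade g-1 in year y (assumed pairing).
    for g in range(1, n_levels):
        ratios.append(mean(
            enrollment[y + 1][g] / enrollment[y][g - 1]
            for y in years if y + 1 in enrollment
        ))
    return ratios


def project(mean_births, ratios, first_level, last_level):
    """Chain E_n = E_{n-1} * R_{n-1 to n}, then sum levels first..last (0 = K)."""
    level = mean_births
    by_level = []
    for ratio in ratios[: last_level + 1]:
        level *= ratio
        by_level.append(level)
    return sum(by_level[first_level : last_level + 1])


# Hypothetical district history (not data from the study).
births = {1950: 100, 1951: 100}
enrollment = {1955: [90, 80, 70], 1956: [90, 81, 72]}

ratios = average_ratios(births, enrollment)
total = project(mean_births=100, ratios=ratios, first_level=0, last_level=2)
```

For a secondary projection the same chain would simply be run through all the lower levels first and only the upper levels summed, which is what the `first_level` argument allows.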


Note 
dence were can numbers of births by place of resi- 

У as the basis for project 
only estimates of element pr nee Шаша 
ary enroll- 
evel projec- 


про of the entire 
TY projection, rather than as 


» the same pre- 
ere used. Instead of 
squares criterion was 
ations of the form Y = a,
ОГ transition int à
уе.


о each 
Once thes e 
е5 were iden- 


In summary, for each grade:


E_n = a + b (E_{n-1})


Where:


E_n = Grade n enrollment in year x,

a = The y intercept of the line defining the relationship
between the [...] variables for the period 1955-1960,

b = The slope of [...]


3. School districts classified in Stratum C are
located in communities that have experienced an increase
of 10 percent or more, but less than 50 percent
in both general population and number of households
during the decade 1950-1960.


4. School districts classified in Stratum D are
located in communities that have experienced an increase
of less than 10 percent in both general population
and number of households, or an increase of
less than 10 percent in one of the classification variables
and an increase of 10 percent or more in the
other, during the decade 1950-1960.


5. School districts classified in Stratum E are
located in communities that have experienced a decrease
in either or both general population and number
of households during the decade 1950-1960.


The final sample consisted of twenty-five school 
districts, five drawn randomly from each of the pre- 
viously described strata. 


Goodness of fit between estimated and actual school
district enrollment in 1965 and 1968 formed the basis
for evaluation. Each of the twenty-five districts
that comprised the sample was characterized by eight
separate estimates of different aspects of its enrollment:
projections of elementary enrollment
in 1965, secondary enrollment in 1965, elementary
enrollment in 1968, and secondary enrollment in 1968,
each by Cohort-survival ratio analysis and by regression
analysis. Goodness of fit was determined by subtracting
the actual enrollment figure from the projected
enrollment figure for each of the aforementioned
estimates, dividing each difference by its actual enrollment
figure, and reporting the resulting statistic, the W coefficient,
a measure of the percentage of error in
each projection. The smaller the W coefficient in absolute
value, the better the estimate.
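The W coefficient and the within-district ranking can be written out directly. In the sketch below the enrollment figures are hypothetical, the function names are illustrative, and two details are assumptions not settled by the source: ranks are assigned on the magnitude of W, and a tie is given to the first method listed.

```python
def w_coefficient(projected, actual):
    """Signed proportional error: (projected - actual) / actual."""
    return (projected - actual) / actual


def rank_pair(w_regression, w_cohort):
    """Return (rank of regression, rank of cohort-survival); rank 1 = smaller |W|.

    Assumption: ties go to the first argument; the source does not discuss ties.
    """
    if abs(w_regression) <= abs(w_cohort):
        return 1, 2
    return 2, 1


# Hypothetical district: actual enrollment 1,000 pupils.
w_reg = w_coefficient(projected=1050, actual=1000)  # over-projection
w_coh = w_coefficient(projected=920, actual=1000)   # under-projection
ranks = rank_pair(w_reg, w_coh)
```

Keeping the sign of W records whether a method over-projected or under-projected, while the ranking uses only the size of the error.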


In order to gain insight into the relative "goodness"
of the two projection approaches, W coefficients
were ranked within each of the four estimates
by district. By examining these ranks it is possible
to make some inferences concerning the relative efficiency
of the two projection methods for estimating
different aspects of public school enrollment within
districts and strata. In addition, W coefficients were
summed across the four sub-category estimates within
districts to obtain composites. These composite
W coefficients were also ranked.


A simple measure of the percentage of error present
in each projection, termed the W coefficient by
the author, was chosen over a chi-square comparison
of frequencies for a number of reasons. First,
a simple descriptive statistic was all that was needed
to describe the relative goodness of the two approaches.
There was no interest at this point in determining
whether or not a given projection differed significantly
from a population value, or whether the
formulae performed differentially within strata. Due
to the unique aspects of the projection problem, in the
first case we would have been placed in the undesirable
position of wanting to fail to reject a null hypothesis,
and in the second case we would have been faced
with an extremely small N. Second, chi-square
requires that observations be independent both within
and between groups. Though projections are certainly
independent between districts, the same equations
were used within districts to obtain more than
one projection. For this reason within-district projections
were lumped together through a composite
W coefficient and ranked, thus eliminating the problem
of lack of independence within districts while
maintaining the independence between districts. Finally,
the W coefficient is more easily interpretable
in the context used here, as it presents simple percentage
of error while preserving the direction of
deviation, information that is lost with chi-square.


Since only two projection methodologies were examined
in this study, the only possible ranks for
each projection within each district are a rank of 1
or 2. All possible independent observations can
thus be conceived of as falling into either one or the
other of two discrete classifications. Since it had
been hypothesized that regression analysis would
yield more accurate estimates of future public school
enrollment than would Cohort-survival ratio analysis,
the important question became one of whether
the probability associated with the number of "1"
ranks received by regression analysis was greater
than could be expected by chance. The Binomial Test
was used to test the hypothesis.
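A minimal sketch of the test just described follows, assuming a one-tailed test with chance probability .5 for each district (the directional hypothesis suggests one tail, but the source does not state it). The count of 18 first-place ranks out of 25 districts is hypothetical and is not a result from the study.

```python
from math import comb


def binomial_upper_tail(successes, trials, p=0.5):
    """P(X >= successes) for X ~ Binomial(trials, p)."""
    return sum(
        comb(trials, k) * p ** k * (1 - p) ** (trials - k)
        for k in range(successes, trials + 1)
    )


# Hypothetical outcome: regression receives rank 1 in 18 of 25 districts.
p_value = binomial_upper_tail(successes=18, trials=25)
```

A small upper-tail probability would indicate that regression received more rank 1 placements than chance alone would be expected to produce.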


QUALIFICATIONS INHERENT IN THE STATISTICAL
PROJECTION METHODS UTILIZED


The major qualification that must be stated for 
both Cohort-survival ratio analysis and regression 
analysis involves the basic tenet that the environ- 
ment in the period for which projections are made 
must remain similar to that of the period from which 
data were drawn. To the extent that this assumption 
is violated, enrollment projections will be in error. 


Regression analysis requires several additional 
assumptions. The most important of these states 
that the line of best fit which specifies the relation- 
ship between the predictor and criterion variables 
be a straight line. To the extent that the data devi- 
ate from linearity, projections derived from the lin- 
ear model will be in error. This assumption is not 
seen as a problem in the present study since most 
functions can be best approximated by a straight line 
over a short period of time or small number of data 
points. 


A second basic assumption underlying the appli- 
cation of regression methods is that the error terms 
in the regression model are independent. Cochrane 
and Orcutt have demonstrated that the usual applica- 
tion of standard least-squares methodology to rela- 
tionships containing high positively auto-correlated 
error terms, while producing unbiased estimates, 
results in a marked decline of the variances of both 
the correlation coefficient and the regression coeffi- 
cient as the error terms become more random (2). 
This phenomenon results in a serious underestimate 
of true error variance in cases where a high degree 
of auto-correlation is present. Given the fact that 
there is a high probability of auto-correlated error 
terms in time-series data, it is doubtful that confi- 
dence intervals computed around the present projec- 
tions in the traditional least-squares manner would 
add significantly to the available data. Since the 
projections themselves are not biased by the exis- 
tence of auto-correlated error terms, the problem 
is reserved for future study. Readers interested in 
the problem of constructing meaningful confidence


3. The W Coefficients for the 1965 elementary
projections (W);

tions yielded by the other method (R = 1 or 2)

5. Projected Secondary Enrollment, 1965